Abstract

This paper addresses the design of precoders maximizing the ergodic mutual information (EMI) of bi-correlated flat fading MIMO systems equipped with MMSE receivers. The channel state information and the second order statistics of the channel are assumed to be available at the receiver side and at the transmitter side respectively. As the direct maximization of the EMI requires computationally unattractive algorithms, it is proposed to optimize a recently introduced approximation of the EMI, obtained when the numbers of transmit and receive antennas $t$ and $r$ converge to $\infty$ at the same rate. It is established that the relative error between the actual EMI and its approximation is a $O(1/t^2)$ term. It is shown that the left singular vectors of the optimum precoder coincide with the eigenvectors of the transmit covariance matrix, and that its singular values are the solution of a certain maximization problem. Numerical experiments show that the mutual information provided by this precoder is close to what is obtained by maximizing the true EMI, but that the algorithm maximizing the approximation is much less computationally intensive.



I. Introduction 

It is now well established that using multiple transmit and receive antennas can increase the Shannon capacity of digital communication systems. Since the seminal work of Telatar ([17]), the ergodic Shannon capacity of block fading MIMO systems has been studied extensively and important questions related to the design of optimal precoding schemes have been addressed. Considering that the Channel State Information (CSI) is available at the receiver side while the transmitter is only aware of its second order statistical properties, many authors have studied the impact of antenna correlation on the capacity of MIMO systems communicating through flat fading channels ([?], [10]) and frequency selective channels ([12]).

The ergodic Shannon capacity is certainly a valuable figure of merit if the MIMO system under consideration is equipped with a maximum likelihood decoder. As the practical implementation of this decoder has a high computational cost, it is also useful to study the potential performance of MIMO systems equipped with the MMSE receiver. The corresponding (Gaussian) ergodic mutual information (EMI), denoted $I_{mmse}$ in the following, is defined as the sum over the transmit antennas of the terms $E(\log(1 + \beta_j))$, where $\beta_j$ represents the output MMSE SINR associated with the stream sent by antenna $j$. The design of precoders maximizing $I_{mmse}$ is of course an important issue because the optimum value of $I_{mmse}$ represents the maximum rate that can be transmitted reliably when the MIMO system uses the MMSE receiver. This optimization problem has been extensively studied in the past, mainly when the CSI is available at both the receiver and the transmitter (see e.g. [13]). It is however often unrealistic to assume that the CSI is available at the transmitter side in the context of mobile systems.

In the present paper, we consider a flat fading MIMO channel with separable correlation structure (Kronecker model). We assume that the channel matrix is known at the receiver side, but that only its transmit and receive covariance matrices are available at the transmitter side. We address the problem of designing precoders that maximize $I_{mmse}$. The expression of $I_{mmse}$ is rather complicated and thus difficult to maximize w.r.t. the precoding matrix. In particular, it seems difficult to establish that the left singular vectors of an optimal precoding matrix coincide with the eigenvectors of the transmit correlation matrix, as in the context of the evaluation of the Shannon ergodic capacity (see e.g. [?]). Therefore, it is necessary to evaluate numerically both the singular values and the singular vectors of optimum precoding matrices, or equivalently to solve a $t^2$-dimensional optimization problem. Steepest descent algorithms require intensive Monte Carlo simulation techniques in order to evaluate the gradient and/or the Hessian of the cost function (see e.g. [20] in the context of the evaluation of the Shannon capacity of correlated Rician channels). Moreover, the convergence of these algorithms is not guaranteed because $I_{mmse}$ is in general not concave. As in previous contributions addressing the behaviour of the Shannon capacity of MIMO systems ([12], [?], [16]), we propose to replace the maximization of $I_{mmse}$ by the maximization of an approximation obtained in the asymptotic regime $t \to \infty$, $r \to \infty$, $t/r \to c$, $c \in (0, \infty)$.

Large system approximations of $I_{mmse}$ were previously considered in the context of CDMA systems with i.i.d. spreading codes (see e.g. [18] and the references therein), which, in the downlink, are formally equivalent to a subclass of the MIMO systems considered in this paper when the spreading codes are Gaussian. The specific case of MIMO systems has also been considered (see e.g. [?], [21]). It was shown that the SINRs $(\beta_j)_{j=1,\ldots,t}$ converge towards deterministic terms depending on the transmit and receive covariance matrices (or their equivalent in the context of downlink CDMA systems). These results provide an obvious large system approximation $\hat{I}_{mmse}$ of $I_{mmse}$.

In this paper, we establish that the large system approximation $\hat{I}_{mmse}$ provides a $O(1/t)$ relative error. This is a rather poor convergence rate compared to the large system approximations of the Shannon capacity, whose relative errors are $O(1/t^2)$ ([?], [?]). We therefore propose to use an improved large system approximation, denoted $\bar{I}_{mmse}$, first introduced in [9] in the case of independent identically distributed (i.i.d.) MIMO channels, and then generalized independently in the conference papers [1] and [13]. The derivations of [13] are based on the replica method, a useful and powerful trick whose mathematical relevance has not yet been established in the present context, and thus differ from the large random matrix approach sketched in [1]. We show that the relative error associated with $\bar{I}_{mmse}$ is a $O(1/t^2)$ term, thus improving the predictions of [1] ($O(t^{-3/2})$)¹ and [13] ($o(1/t)$). The method we use to study the accuracy of $\bar{I}_{mmse}$ differs from [9], whose approach is somewhat similar to [8], a paper devoted to the asymptotic study of the SINRs $(\beta_j)_{j=1,\ldots,t}$. The transmit covariance matrices of the MIMO channels of [9] and [8] are diagonal. This assumption simplifies the analysis, so that the approach of [9], [8] cannot be generalized to the case of general transmit covariance matrices. Next, we address the maximization of
$\bar{I}_{mmse}$ w.r.t. the precoding matrix. We establish that the left singular vectors of an optimum precoder are the eigenvectors of the transmit covariance matrix and that its matrix of right singular vectors is equal to $I_t$. The evaluation of a precoding matrix thus reduces to the evaluation of its singular values, a $t$-dimensional optimization problem. In general, the optimum singular values have no closed form expression. In order to get more insight into the optimum precoders, we consider the case of an uncorrelated MIMO channel, for which it is possible to obtain in closed form the precoders which optimize the approximation $\hat{I}_{mmse}$. We show that the optimum precoders are the diagonal matrices whose entries are either 0 or all equal to $\sqrt{t/s}$, where $s$, the number of non zero entries, depends on the signal to noise ratio. Therefore, the optimum transmission strategy coincides with an antenna selection scheme. Although it is not proved that the above strategy maximizes $I_{mmse}$, this result shows that, at least if $t$ is large enough, antenna selection may provide higher mutual information $I_{mmse}$ than a uniform power allocation. The situation differs from what was shown initially by Telatar ([17]) in the context of the study of the Shannon ergodic capacity of i.i.d. channels: the Shannon capacity achieving covariance matrix coincides with $I_t$. We also remark that our result establishes formally that $\hat{I}_{mmse}$ is in general not a concave function of the precoding matrix, and infer from this that $I_{mmse}$ is not concave either. We finally consider the case of an arbitrary bi-correlated MIMO channel, and propose to evaluate the singular values of an optimum precoder using a classical gradient algorithm. Numerical results show that the precoding matrices evaluated by this algorithm provide nearly the same mutual information as direct approaches maximizing $I_{mmse}$, while being computationally more attractive.

¹ The authors wish to thank Aris Moustakas for suggesting that the rate $O(t^{-3/2})$ was probably pessimistic.

This paper is organized as follows. Section II is devoted to the presentation of the problem and to the underlying assumptions. In Section III, we present the large system approximations $\hat{I}_{mmse}$ and $\bar{I}_{mmse}$ of $I_{mmse}$ and analyse their accuracies. Section IV studies the structure of the optimum precoders, and Section V addresses the optimization of $\bar{I}_{mmse}$.

General Notations. In this paper, the notations y, x, M stand for scalars, vectors and matrices, respectively. As usual, $\|x\|$ represents the Euclidean norm of vector x and $\|M\|$ stands for the spectral norm of matrix M. The superscripts $(\cdot)^T$ and $(\cdot)^H$ represent respectively the transpose and the conjugate transpose. The trace of M is denoted by Tr(M). The mathematical expectation operator is denoted by $E(\cdot)$. The symbols $\Re$ and $\Im$ denote respectively the real and imaginary parts of a given complex number. If $x$ is a possibly complex-valued random variable, $\mathrm{Var}(x) = E|x|^2 - |E(x)|^2$ represents the variance of $x$.

Throughout this paper, $t$ and $r$ stand for the number of transmit and receive antennas. Certain quantities will be studied in the asymptotic regime $t \to \infty$, $r \to \infty$ in such a way that $t/r \to c \in (0, \infty)$. In order to simplify the notations, $t \to \infty$ should be understood from now on as $t \to \infty$, $r \to \infty$ and $t/r \to c \in (0, \infty)$. A vector $x_t$ and a matrix $M_t$ whose sizes depend on $t$ are said to be uniformly bounded if $\sup_t \|x_t\| < \infty$ and $\sup_t \|M_t\| < \infty$.

Several variables used throughout this paper depend on various parameters, e.g. the number of antennas, 
the noise level, etc. In order to simplify the notations, we do not mention all these dependencies. 



November 2, 2009 



DRAFT 






Notation C will denote a generic strictly positive constant whose main feature is not to depend on t. 
The value of C might change from one line to another. 

II. Problem statement. 

We consider a MIMO system equipped with $r$ receive antennas and $t$ transmit antennas. The MIMO channel matrix H is supposed to be a Gaussian random matrix defined by

$$H = \frac{1}{\sqrt{t}}\, C_R^{1/2} H_{iid}\, C_T^{1/2}$$    (1)

where $H_{iid}$ is a $r \times t$ matrix whose entries are independent and identically distributed (i.i.d.) complex circular Gaussian random variables CN(0, 1), i.e. $H_{iid,i,j} = \Re H_{iid,i,j} + i\, \Im H_{iid,i,j}$ where $\Re H_{iid,i,j}$ and $\Im H_{iid,i,j}$ are independent centered real Gaussian random variables with variance $1/2$. Matrices $C_T$ and $C_R$ are positive definite matrices modeling respectively the impact of correlation between transmitting and receiving antennas. We assume that $\frac{1}{t}\mathrm{Tr}(C_T) = 1$ and $\frac{1}{r}\mathrm{Tr}(C_R) = 1$. This assumption implies that $\frac{1}{r} E(\mathrm{Tr}(HH^H)) = 1$.

Each transmit antenna $j$ sends a sequence $(x_j(n))_{n \in \mathbb{Z}}$ defined by

$$x(n) = (x_1(n), \ldots, x_t(n))^T = K s(n) = K (s_1(n), \ldots, s_t(n))^T$$

where the sequences $(s_j(n))_{n \in \mathbb{Z}}$, $j = 1, \ldots, t$, are assumed to be unit variance, mutually independent i.i.d. sequences. K represents a precoding matrix satisfying $\frac{1}{t}\mathrm{Tr}(KK^H) \le 1$.

The corresponding $r$-variate discrete-time received signal $(y(n))_{n \in \mathbb{Z}}$ is given by

$$y(n) = H K s(n) + n(n)$$    (2)

where n is a white Gaussian noise with covariance matrix $E(n(n) n(n)^H) = \sigma^2 I_r$.

In this paper, we evaluate the potential performance of the MIMO system (2) when the receiver is equipped with the MMSE receiver. In other words, each symbol sequence $s_j$ is estimated by the Wiener filter prior to decoding, i.e. $s_j(n)$ is estimated by

$$\hat{s}_j(n) = k_j^H H^H (H K K^H H^H + \sigma^2 I_r)^{-1} y(n)$$

where $k_j$ represents column $j$ of K. In the following, we denote by $Q_T(K)$ the matrix

$$Q_T(K) = (K^H H^H H K + \sigma^2 I_t)^{-1}$$    (3)

It is standard that the SINR $\beta_j$ provided by this linear receiver is given by ([19])

$$\beta_j(K) = \frac{1}{\sigma^2 (Q_T(K))_{j,j}} - 1$$    (4)

The ergodic mutual information $I_{mmse}(K)$ of the MIMO system under consideration is thus equal to

$$I_{mmse}(K) = E\left[\sum_{j=1}^{t} \log(1 + \beta_j(K))\right] = -E\left[\sum_{j=1}^{t} \log\left(\sigma^2 Q_T(K)\right)_{j,j}\right]$$    (5)

where the mathematical expectation is over the probability distribution of the random matrix H. In order to maximize $I_{mmse}(K)$ over the set $\frac{1}{t}\mathrm{Tr}(KK^H) \le 1$, it is necessary to use numerical techniques based on steepest descent algorithms. As the gradient and the Hessian of $I_{mmse}$ have no simple expression, they have to be evaluated using intensive Monte Carlo simulations (see e.g. [20]). Moreover, to the best of our knowledge, the singular vectors of an optimum matrix have no closed form expression. Therefore, the dimension of the optimization problem cannot be reduced from $t^2$ to $t$ as in the context of the evaluation of the capacity achieving covariance matrix ([?]).
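As an illustration of the quantities defined in this section, the following sketch evaluates the MMSE SINRs through the relation $1 + \beta_j = 1/(\sigma^2 (Q_T)_{j,j})$ and estimates $I_{mmse}$ by Monte Carlo simulation. It is a minimal pure-Python sketch, not the simulation code used by the authors: the helper names (`inverse`, `mmse_sinrs`, `monte_carlo_immse`) are hypothetical, and the Monte Carlo routine assumes the uncorrelated case $C_T = I$, $C_R = I$, $K = I$.

```python
import math
import random

def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    M = [list(map(complex, A[i])) + [complex(i == j) for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for i in range(n):
            if i != c:
                f = M[i][c]
                M[i] = [vi - f * vc for vi, vc in zip(M[i], M[c])]
    return [row[n:] for row in M]

def mmse_sinrs(G, sigma2):
    """SINRs beta_j = 1/(sigma2 * Q_jj) - 1, with Q = (G^H G + sigma2 I)^{-1} and G = HK."""
    t = len(G[0])
    A = matmul(hermitian(G), G)
    for j in range(t):
        A[j][j] += sigma2
    Q = inverse(A)
    return [1.0 / (sigma2 * Q[j][j].real) - 1.0 for j in range(t)]

def monte_carlo_immse(r, t, sigma2, trials=200, seed=0):
    """Monte Carlo estimate of I_mmse for the assumed case C_T = I, C_R = I, K = I."""
    rng = random.Random(seed)
    std = math.sqrt(0.5)  # real and imaginary parts have variance 1/2
    total = 0.0
    for _ in range(trials):
        H = [[complex(rng.gauss(0, std), rng.gauss(0, std)) / math.sqrt(t)
              for _ in range(t)] for _ in range(r)]
        total += sum(math.log(1.0 + b) for b in mmse_sinrs(H, sigma2))
    return total / trials
```

When the columns of HK are orthogonal, the SINR of stream $j$ reduces to $\|h_j\|^2/\sigma^2$, which gives a simple sanity check of `mmse_sinrs`.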



III. Derivation of the large system approximation of I mmse . 

In this section, we introduce the large system approximation presented in [1] and [13], and improve the results stated without proof in [1] concerning its accuracy. Our approach is based on the Gaussian large random matrix techniques initiated by Pastur ([14]). Pastur's approach was used in [?] in order to establish the asymptotic Gaussianity of the traditional mutual information of bi-correlated MIMO channels.



We study in this section the asymptotic behaviour of $I_{mmse}$ in the case where the precoding matrix K is reduced to $K = I_t$ to simplify the notations. In order to deduce the results in the case $K \neq I_t$, we remark that the channel matrix HK can be interpreted as a bi-correlated MIMO channel with transmit and receive covariance matrices $K^H C_T K$ and $C_R$ respectively. We will therefore replace matrix $C_T$ by matrix $K^H C_T K$. $I_{mmse}(I)$ and $Q_T(I)$ are denoted $I_{mmse}$ and $Q_T$ in the remainder of this section.

We first explain the differences between our analysis and the contributions [9] and [8]. We recall that [9] addresses the i.i.d. case while [8] assumes that matrix $C_T = \mathrm{Diag}(c_{T,1}, \ldots, c_{T,t})$ is diagonal. In this last context, the SINR $\beta_j$ can also be written as

$$\beta_j = \frac{c_{T,j}}{t}\, h_{iid,j}^H C_R^{1/2} \left( \frac{1}{t} C_R^{1/2} H_{iid}^{(j)} C_T^{(j)} H_{iid}^{(j)H} C_R^{1/2} + \sigma^2 I_r \right)^{-1} C_R^{1/2} h_{iid,j}$$    (6)

where $h_{iid,j}$ represents column $j$ of $H_{iid}$, matrix $H_{iid}^{(j)}$ is obtained from $H_{iid}$ by deleting column $j$, and $C_T^{(j)}$ represents the $(t-1) \times (t-1)$ diagonal matrix obtained by deleting column and row $j$ from $C_T$. The approaches of [9] and [8] rely on the key observation that vector $h_{iid,j}$ is independent from the matrix $H_{iid}^{(j)}$. This makes it possible to study the behaviour of $\beta_j$ using important results concerning the behaviour of random quadratic forms. If matrix $C_T$ is non diagonal, $\beta_j$ does not have the same structure as in (6): vector $h_{iid,j}$ and matrix $H_{iid}^{(j)}$ are replaced by non independent terms, and the approach of [9] and [8] cannot be used. Our approach does not study $\beta_j$ directly, but rather the diagonal entries of matrix $\sigma^2 Q_T$, whose asymptotic behaviour can be evaluated for general transmit covariance matrices $C_T$.

The study of the accuracy of the approximation is essentially based on the study of a virtual channel obtained from H after unitary transformations. We consider the eigenvalue/eigenvector decompositions of covariance matrices $C_T$ and $C_R$:

$$C_T = U D U^H, \qquad C_R = \tilde{U} \tilde{D} \tilde{U}^H$$    (7)

where the diagonal entries $(d_j)_{j=1,\ldots,t}$ and $(\tilde{d}_i)_{i=1,\ldots,r}$ of $D$ and $\tilde{D}$ are arranged in decreasing order. Then, we define the random $t \times r$ matrix Y by

$$Y^H = \tilde{U}^H H U$$    (8)

Y can be written as

$$Y = \frac{1}{\sqrt{t}}\, D^{1/2} X \tilde{D}^{1/2}$$    (9)

where X represents the $t \times r$ matrix $X = U^H H_{iid}^H \tilde{U}$. As $U$ and $\tilde{U}$ are unitary, matrix X is an i.i.d. complex Gaussian matrix such that $E|X_{i,j}|^2 = 1$. In the following, we denote by Q the matrix defined by

$$Q = (Y Y^H + \sigma^2 I)^{-1}$$    (10)

The study of $I_{mmse}$ when $t \to \infty$ is based on the asymptotic properties of the diagonal entries of matrix $Q_T$. We remark that

$$Q_T = U Q U^H$$    (11)

and evaluate the asymptotic behaviour of $u Q u^H$ where $u = (u_1, \ldots, u_t)$ is a unit norm deterministic row vector. We use in the following certain results of [6]. We however note that in [6], matrix Q is replaced by matrix $(I + \sigma^2 Y Y^H)^{-1}$. Therefore, the statements of [6] have to be adapted. In the sequel, we denote by $\delta$ and $\tilde{\delta}$ the unique strictly positive solutions of the system

$$\delta = \frac{1}{t} \mathrm{Tr}\, D \left(\sigma^2 (I + \tilde{\delta} D)\right)^{-1}, \qquad \tilde{\delta} = \frac{1}{t} \mathrm{Tr}\, \tilde{D} \left(\sigma^2 (I + \delta \tilde{D})\right)^{-1}$$    (12)

The existence and the uniqueness of the solution have been established in Proposition 1 of [6]. We denote by $T$ and $\tilde{T}$ the diagonal matrices

$$T = \left(\sigma^2 (I + \tilde{\delta} D)\right)^{-1}, \qquad \tilde{T} = \left(\sigma^2 (I + \delta \tilde{D})\right)^{-1}$$    (13)

and gather in the following proposition certain useful results of [6].

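Proposition 1: Assume that matrices $D$ and $\tilde{D}$ satisfy the following conditions:

$$\sup_t \|D\| < d_{max} < \infty, \quad \inf_t \frac{1}{t}\mathrm{Tr}\, D > 0, \qquad \sup_t \|\tilde{D}\| < \tilde{d}_{max} < \infty, \quad \inf_t \frac{1}{t}\mathrm{Tr}\, \tilde{D} > 0$$    (14)

Then, the following results hold true:

• For each uniformly bounded deterministic matrix M²,

$$\mathrm{Var}\left(\frac{1}{t}\mathrm{Tr}(MQ)\right) = O\left(\frac{1}{t^2}\right), \qquad E\left(\mathrm{Tr}(M(Q - T))\right) = O\left(\frac{1}{t}\right)$$    (15)

• $\gamma$ and $\tilde{\gamma}$ defined by

$$\gamma = \frac{1}{t}\mathrm{Tr}(D^2 T^2), \qquad \tilde{\gamma} = \frac{1}{t}\mathrm{Tr}(\tilde{D}^2 \tilde{T}^2)$$    (16)

satisfy

$$\inf_t \left(1 - \sigma^4 \gamma \tilde{\gamma}\right) > 0$$    (17)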
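The system (12) only involves the eigenvalues of $C_T$ and $C_R$, so it can be solved numerically with scalar operations. The sketch below uses a plain fixed-point iteration; this is an assumption made for illustration (the text only states existence and uniqueness of the positive solution, not a particular algorithm), and the function name `solve_delta` and its parameters are hypothetical.

```python
def solve_delta(d, d_tilde, sigma2, tol=1e-12, max_iter=10000):
    """Fixed-point iteration for the system (12).

    d, d_tilde: eigenvalues of C_T (length t) and C_R (length r).
    Both equations are normalized by t, as in (12).
    Returns (delta, delta_tilde).
    """
    t = len(d)
    delta, delta_t = 1.0, 1.0
    for _ in range(max_iter):
        new_delta = sum(x / (sigma2 * (1.0 + delta_t * x)) for x in d) / t
        new_delta_t = sum(x / (sigma2 * (1.0 + new_delta * x)) for x in d_tilde) / t
        if abs(new_delta - delta) + abs(new_delta_t - delta_t) < tol:
            return new_delta, new_delta_t
        delta, delta_t = new_delta, new_delta_t
    raise RuntimeError("fixed-point iteration did not converge")
```

In the uncorrelated square case ($D = \tilde{D} = I$, $r = t$) the system reduces to $\sigma^2 \delta^2 + \sigma^2 \delta - 1 = 0$, so that $\sigma^2 = 0.5$ gives $\delta = \tilde{\delta} = 1$, which can be used to check the iteration.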



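We assume from now on that the matrices $D$ and $\tilde{D}$ satisfy (14). We are now in position to state the main results of this section. We begin with the following proposition.

Proposition 2:

$$\sup_{u, \|u\| = 1} \left| E\left(u (Q - T) u^H\right) \right| = O\left(\frac{1}{t^{3/2}}\right)$$    (18)

and

$$\sup_{u, \|u\| = 1} E\left[\left(u (Q - E(Q)) u^H\right)^3\right] = O\left(\frac{1}{t^2}\right)$$    (19)

Moreover,

$$\sup_{u, \|u\| = 1} \left| \mathrm{Var}\left(u Q u^H\right) - \frac{1}{t}\, \frac{\sigma^4 \tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}} \left(u T^2 D u^H\right)^2 \right| = O\left(\frac{1}{t^{3/2}}\right)$$    (20)

Finally, if we denote by $(v_k)_{k=1,\ldots,t}$ the row vectors of any unitary matrix V, and if $(\kappa_k)_{k=1,\ldots,t}$ denote positive numbers such that $\sup_k \kappa_k \le C$, we have

$$\left| \sum_{k=1}^{t} \mathrm{Var}\left(\kappa_k v_k Q v_k^H\right) - \frac{1}{t}\, \frac{\sigma^4 \tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}} \sum_{k=1}^{t} \left(\kappa_k v_k T^2 D v_k^H\right)^2 \right| = O\left(\frac{1}{t}\right)$$    (21)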

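The proof is given in the Appendix. In order to introduce the large system approximation $\bar{I}_{mmse}$, we define matrices $T_T$ and $T_R$ by

$$T_T = U T U^H = \left(\sigma^2 (I + \tilde{\delta} C_T)\right)^{-1}, \qquad T_R = \tilde{U} \tilde{T} \tilde{U}^H = \left(\sigma^2 (I + \delta C_R)\right)^{-1}$$    (22)

We note that $(\delta, \tilde{\delta})$ and $(\gamma, \tilde{\gamma})$ can be expressed in terms of $T_T$ and $T_R$ as

$$\delta = \frac{1}{t}\mathrm{Tr}\left[C_T \left(\sigma^2 (I + \tilde{\delta} C_T)\right)^{-1}\right] = \frac{1}{t}\mathrm{Tr}[C_T T_T], \qquad \tilde{\delta} = \frac{1}{t}\mathrm{Tr}\left[C_R \left(\sigma^2 (I + \delta C_R)\right)^{-1}\right] = \frac{1}{t}\mathrm{Tr}[C_R T_R]$$    (23)

and

$$\gamma = \frac{1}{t}\mathrm{Tr}(C_T^2 T_T^2), \qquad \tilde{\gamma} = \frac{1}{t}\mathrm{Tr}(C_R^2 T_R^2)$$    (24)

² In [6], matrix M is diagonal. The case of non diagonal matrices is addressed in [5], devoted to correlated Ricean channels.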



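The following result holds.

Theorem 1: We define $\bar{I}_{mmse}$ by

$$\bar{I}_{mmse} = -\sum_{j=1}^{t} \log\left(\sigma^2 T_{T,j,j}\right) + \frac{1}{2 t\, \tilde{\delta}^2}\, \frac{\tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}} \sum_{j=1}^{t} \left(1 - \frac{\left((\sigma^2 T_T)^2\right)_{j,j}}{\sigma^2 T_{T,j,j}}\right)^2$$    (25)

Then,

$$I_{mmse} = \bar{I}_{mmse} + O\left(\frac{1}{t}\right)$$    (26)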



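Proof. The proof is based on a second order expansion of $\log \sigma^2 Q_{T,j,j}$ around the point $E(Q_{T,j,j})$. We define $\epsilon_j$ by

$$\epsilon_j = \frac{Q_{T,j,j} - E(Q_{T,j,j})}{E(Q_{T,j,j})}$$

and write $\log \sigma^2 Q_{T,j,j}$ as

$$\log \sigma^2 Q_{T,j,j} = \log\left(\sigma^2 E(Q_{T,j,j})\right) + \log(1 + \epsilon_j)$$

We express $\log(1 + \epsilon_j)$ as

$$\log(1 + \epsilon_j) = \epsilon_j - \frac{\epsilon_j^2}{2} + \frac{\epsilon_j^3}{3} + r_j$$

As $E(\epsilon_j) = 0$, $E(\log \sigma^2 Q_{T,j,j})$ can be written as

$$E\left(\log \sigma^2 Q_{T,j,j}\right) = \log\left(\sigma^2 E(Q_{T,j,j})\right) - \frac{1}{2} E(\epsilon_j^2) + \frac{1}{3} E(\epsilon_j^3) + E(r_j)$$    (27)

In order to be able to use Proposition 2, we have to study the behaviour of $1/E(Q_{T,j,j})$. We first remark that there exists a deterministic constant $C > 0$ such that

$$\sup_j \frac{1}{E(Q_{T,j,j})} \le C$$    (28)

Indeed, $\frac{1}{E(Q_{T,j,j})} \le E\left(\frac{1}{Q_{T,j,j}}\right) = \sigma^2 E(1 + \beta_j)$ by the Jensen inequality. We denote by $h_j$ column $j$ of matrix H. The SINR $\beta_j$ provided by the MMSE receiver is upper bounded by the matched filter bound, i.e. $\beta_j \le \frac{\|h_j\|^2}{\sigma^2}$. As it is clear that $\sup_j E(\|h_j\|^2) < C$, we get (28).

We use (21) with $\kappa_j = \frac{1}{E(Q_{T,j,j})}$ and when the unitary matrix V coincides with matrix U. We obtain immediately that

$$\sum_{j=1}^{t} E(\epsilon_j^2) = \frac{1}{t}\, \frac{\sigma^4 \tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}} \sum_{j=1}^{t} \left(\frac{u_j T^2 D u_j^H}{E(u_j Q u_j^H)}\right)^2 + O\left(\frac{1}{t}\right)$$    (29)

We now establish that

$$\sup_j \left| \frac{1}{E(u_j Q u_j^H)} - \frac{1}{u_j T u_j^H} \right| = O\left(\frac{1}{t^{3/2}}\right)$$    (30)

For this, we first notice that

$$\sup_j \frac{1}{T_{T,j,j}} \le C$$    (31)

Indeed,

$$\frac{1}{T_{T,j,j}} \le \left(T_T^{-1}\right)_{j,j} = \sigma^2 \left(1 + \tilde{\delta}\, C_{T,j,j}\right) \le \sigma^2 \left(1 + \tilde{\delta}\, d_{max}\right)$$

The conclusion follows from $\tilde{\delta} \le \frac{1}{\sigma^2 t}\mathrm{Tr}(\tilde{D}) < C$. (30) follows directly from

$$\frac{1}{E(Q_{T,j,j})} - \frac{1}{T_{T,j,j}} = \frac{T_{T,j,j} - E(Q_{T,j,j})}{T_{T,j,j}\, E(Q_{T,j,j})}$$

and from (18).

(29) and (30) imply that

$$\sum_{j=1}^{t} E(\epsilon_j^2) = \frac{1}{t}\, \frac{\sigma^4 \tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}} \sum_{j=1}^{t} \left(\frac{u_j T^2 D u_j^H}{u_j T u_j^H}\right)^2 + O\left(\frac{1}{t}\right)$$    (32)

Moreover, (19) and (28) lead to

$$\sup_j E(\epsilon_j^3) \le \frac{C}{t^2}$$    (33)

In order to evaluate the influence of $r_j$, we give the following lemma, proved in the Appendix.

Lemma 1:

$$\sup_j |E(r_j)| = O\left(\frac{1}{t^2}\right)$$    (34)

(32), (33) and (34) imply that

$$I_{mmse} = -\sum_{j=1}^{t} \log E\left(\sigma^2 Q_{T,j,j}\right) + \frac{1}{2t}\, \frac{\sigma^4 \tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}} \sum_{j=1}^{t} \left(\frac{u_j T^2 D u_j^H}{u_j T u_j^H}\right)^2 + O\left(\frac{1}{t}\right)$$

Straightforward manipulations show that

$$\sum_{j=1}^{t} \left(\frac{u_j T^2 D u_j^H}{u_j T u_j^H}\right)^2 = \frac{1}{\sigma^4 \tilde{\delta}^2} \sum_{j=1}^{t} \left(1 - \frac{\left((\sigma^2 T_T)^2\right)_{j,j}}{\sigma^2 T_{T,j,j}}\right)^2$$

In order to establish Theorem 1, it remains to prove that

$$\sum_{j=1}^{t} \log E\left(\sigma^2 Q_{T,j,j}\right) = \sum_{j=1}^{t} \log\left(\sigma^2 T_{T,j,j}\right) + O\left(\frac{1}{t}\right)$$    (35)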
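We define $\epsilon'_j$ as

$$\epsilon'_j = \frac{T_{T,j,j} - E(Q_{T,j,j})}{E(Q_{T,j,j})}$$

and remark that

$$\log\left(\sigma^2 T_{T,j,j}\right) = \log E\left(\sigma^2 Q_{T,j,j}\right) + \log(1 + \epsilon'_j)$$

Using (28), we obtain that

$$|\epsilon'_j| \le C\, |T_{T,j,j} - E(Q_{T,j,j})|$$    (36)

(18) implies that $\sup_j |T_{T,j,j} - E(Q_{T,j,j})| = O(t^{-3/2})$. By (36), we get that $\sup_j |\epsilon'_j| = O(t^{-3/2})$. For $t$ large enough, $|\epsilon'_j| \le \Delta < 1$ for each $j$. For these $t$, we can write $\log(1 + \epsilon'_j)$ as

$$\log(1 + \epsilon'_j) = \epsilon'_j + r'_j$$

where

$$r'_j = (\epsilon'_j)^2 \sum_{n=2}^{\infty} \frac{(-1)^{n-1} (\epsilon'_j)^{n-2}}{n}$$

It holds that $\sup_j (\epsilon'_j)^2 = O(t^{-3})$. Therefore,

$$\sup_j |r'_j| \le \sup_j (\epsilon'_j)^2 \sum_{n=2}^{\infty} \frac{\Delta^{n-2}}{n} = O\left(\frac{1}{t^3}\right)$$

Consequently,

$$\sum_{j=1}^{t} \log(1 + \epsilon'_j) = \sum_{j=1}^{t} \epsilon'_j + O\left(\frac{1}{t^2}\right)$$

We finally remark that

$$\sum_{j=1}^{t} \epsilon'_j = \sum_{j=1}^{t} \kappa_j\, u_j (T - E(Q)) u_j^H$$

where $\kappa_j = \frac{1}{E(Q_{T,j,j})}$. The second item of (15) can thus be used for matrix $M = \sum_{j=1}^{t} \kappa_j u_j^H u_j$, thus showing that (35) holds. This completes the proof of (26).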



We denote by $\hat{I}_{mmse}$ the term defined by

$$\hat{I}_{mmse} = -\sum_{j=1}^{t} \log\left(\sigma^2 T_{T,j,j}\right)$$    (37)

$\hat{I}_{mmse}$ corresponds to the obvious large system approximation of $I_{mmse}$ obtained by replacing, for each $j$, $(1 + \beta_j)$ by its "deterministic equivalent" $(\sigma^2 T_{T,j,j})^{-1}$. Theorem 1 shows that the relative error provided by $\hat{I}_{mmse}$ is a $O(1/t)$ term, while the relative error of $\bar{I}_{mmse}$ is a $O(1/t^2)$ term.

Fig. 1. Accuracy of the large system approximant

We now present some simulation experiments which demonstrate the accuracy of the approximation $\bar{I}_{mmse}$ for a realistic number of antennas. $\hat{I}_{mmse}$ is also represented. The transmit antennas correlation matrix $C_T$ is generated according to the popular model proposed in [2], i.e.

$$C_{T,k,l} = a\, e^{-i \pi (k - l) \cos\phi_T}\, e^{-\frac{1}{2}\left(\pi (k - l) \sin\phi_T\, \sigma_{\phi_T}\right)^2}$$    (38)

where $a$ is a constant chosen in such a way that $\frac{1}{t}\mathrm{Tr}(C_T) = 1$. $\phi_T$ and $\sigma_{\phi_T}$ can be interpreted as the mean angle of departure and the standard deviation of the angles of departure of a scatterer cluster respectively. We notice that if $\sigma_{\phi_T} \simeq 0$, then $\mathrm{Rank}(C_T) \simeq 1$. We refer the reader to [2] for more details.

The receive antennas correlation matrix is generated similarly with different parameters $\phi_R$ and $\sigma_{\phi_R}$.
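The correlation model (38) is straightforward to generate numerically. The sketch below is an illustration only: the function name `correlation_matrix` is hypothetical, and it assumes $a = 1$, which is what the normalization $\frac{1}{t}\mathrm{Tr}(C_T) = 1$ requires since every diagonal entry of (38) equals $a$.

```python
import cmath
import math

def correlation_matrix(n, phi, sigma_phi):
    """n x n matrix with entries
    exp(-i*pi*(k-l)*cos(phi)) * exp(-0.5*(pi*(k-l)*sin(phi)*sigma_phi)**2),
    i.e. model (38) with the constant a set to 1.
    """
    C = []
    for k in range(n):
        row = []
        for l in range(n):
            s = k - l
            phase = cmath.exp(-1j * math.pi * s * math.cos(phi))
            decay = math.exp(-0.5 * (math.pi * s * math.sin(phi) * sigma_phi) ** 2)
            row.append(phase * decay)
        C.append(row)
    return C
```

For instance, `correlation_matrix(4, math.pi / 4, 0.5)` corresponds to the transmit-side parameters used in Figure 1. With `sigma_phi = 0` all entries have unit modulus and the matrix is exactly rank one, in line with the remark on $\mathrm{Rank}(C_T)$.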

In Figure 1, we have represented $I_{mmse}$, $\bar{I}_{mmse}$ and $\hat{I}_{mmse}$ versus the SNR for $r = t = 4$. Here, the various parameters are equal to $\phi_T = \pi/4$, $\sigma_{\phi_T} = 0.5$, $\phi_R = \pi/12$, $\sigma_{\phi_R} = 0.5$. We observe that $\hat{I}_{mmse}$ can be rather far from the true mutual information $I_{mmse}$ evaluated by Monte Carlo simulation over 1000 channel realizations. Figure 2 represents the relative error between $I_{mmse}$ and $\bar{I}_{mmse}$, $\hat{I}_{mmse}$ respectively in terms of the angle of departure spread $\sigma_{\phi_T}$ for SNR = 0 dB and SNR = 6 dB when $\phi_T = \pi/4$, $\phi_R = \pi/12$, $\sigma_{\phi_R} = 0.4$. Figures 1 and 2 show that the approximation $\bar{I}_{mmse}$ provides significantly better results than $\hat{I}_{mmse}$.
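The two approximations compared above can be evaluated with a few lines of code once the fixed-point system defining $\delta$ and $\tilde{\delta}$ is solved. The sketch below is restricted to a diagonal transmit covariance matrix, an assumption made here so that only eigenvalues are needed: in that case the correction term added to $\hat{I}_{mmse}$ reduces to $\frac{1}{2} \frac{\sigma^4 \gamma \tilde{\gamma}}{1 - \sigma^4 \gamma \tilde{\gamma}}$. The function name and the fixed number of iterations are hypothetical choices, not the authors' implementation.

```python
import math

def large_system_approximations(d, d_tilde, sigma2, iters=5000):
    """Return (crude, refined) approximations of I_mmse for diagonal C_T.

    d, d_tilde: eigenvalues of C_T and C_R. Assumes C_T is diagonal, so that
    T_T is diagonal and the second-order term only depends on gamma, gamma_tilde.
    """
    t = len(d)
    delta = delta_t = 1.0
    for _ in range(iters):  # plain fixed-point iteration on the delta system
        delta = sum(x / (sigma2 * (1 + delta_t * x)) for x in d) / t
        delta_t = sum(x / (sigma2 * (1 + delta * x)) for x in d_tilde) / t
    T = [1 / (sigma2 * (1 + delta_t * x)) for x in d]
    T_t = [1 / (sigma2 * (1 + delta * x)) for x in d_tilde]
    gamma = sum((x * v) ** 2 for x, v in zip(d, T)) / t
    gamma_t = sum((x * v) ** 2 for x, v in zip(d_tilde, T_t)) / t
    crude = -sum(math.log(sigma2 * v) for v in T)
    s4 = sigma2 ** 2
    refined = crude + 0.5 * s4 * gamma * gamma_t / (1 - s4 * gamma * gamma_t)
    return crude, refined
```

For $C_T = C_R = I$, $r = t = 4$ and $\sigma^2 = 0.5$, one has $\delta = \tilde{\delta} = \gamma = \tilde{\gamma} = 1$, so the crude value is $4 \log 2$ and the correction term is $1/6$.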



The expression (25) is a large system approximation of $I_{mmse}(I)$. If the precoding matrix K is not equal to I, the approximation of $I_{mmse}(K)$ is obtained by replacing matrix $C_T$ by matrix $K^H C_T K$. In the following, we denote by $\delta(K)$, $\tilde{\delta}(K)$, $T_T(K)$, $T_R(K)$, $\gamma(K)$, $\tilde{\gamma}(K)$, $\bar{I}_{mmse}(K)$, $\hat{I}_{mmse}(K)$ the values of parameters $\delta$, $\tilde{\delta}$, $T_T$, $T_R$, $\gamma$, $\tilde{\gamma}$, $\bar{I}_{mmse}$, $\hat{I}_{mmse}$ when $C_T$ is replaced by $K^H C_T K$.

Fig. 2. Relative error

IV. Structure of optimal precoders. 

In this section, we study the problem of designing precoders maximizing the function $\bar{I}_{mmse}(K)$ over the set $\mathcal{K}$ defined by

$$\mathcal{K} = \left\{K,\ \frac{1}{t}\mathrm{Tr}(KK^H) \le 1\right\}$$    (39)

The main result of this section states that there is no restriction to look for optimal precoders of the form $K = U D^{-1/2} \Lambda^{1/2}$ where $\Lambda$ is a diagonal matrix with positive elements. In order to establish this, we first derive the following intermediate result.

Proposition 3: Let K be an element of $\mathcal{K}$, and let

$$K^H C_T K = W \Lambda W^H$$

be the eigenvalue/eigenvector decomposition of matrix $K^H C_T K$. Then, matrix $K_d = K W$ belongs to $\mathcal{K}$ and satisfies

$$\bar{I}_{mmse}(K) \le \bar{I}_{mmse}(K_d)$$    (40)
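Proof. It is obvious that $K_d \in \mathcal{K}$. In order to establish (40), we denote by $J_{mmse}(K)$ the term

$$J_{mmse}(K) = \frac{1}{2 t\, \tilde{\delta}(K)^2}\, \frac{\tilde{\gamma}(K)}{1 - \sigma^4 \gamma(K) \tilde{\gamma}(K)} \sum_{j=1}^{t} \left(1 - \frac{\left((\sigma^2 T_T(K))^2\right)_{j,j}}{\sigma^2 T_{T,j,j}(K)}\right)^2$$    (41)

and remark that $\bar{I}_{mmse}(K) = \hat{I}_{mmse}(K) + J_{mmse}(K)$. We prove in the following that $\hat{I}_{mmse}(K) \le \hat{I}_{mmse}(K_d)$ and $J_{mmse}(K) \le J_{mmse}(K_d)$.

We first remark that $K_d^H C_T K_d$ is the diagonal matrix $\Lambda$. Therefore, by (22), matrix $\sigma^2 T_T(K_d)$ is also diagonal, and is given by

$$\sigma^2 T_T(K_d) = \left(I + \tilde{\delta}(K_d) \Lambda\right)^{-1}$$

Moreover,

$$\sigma^2 T_T(K) = W \sigma^2 T_T(K_d) W^H$$    (42)

We claim that $(\delta(K), \tilde{\delta}(K)) = (\delta(K_d), \tilde{\delta}(K_d))$. To check this, we recall that $(\delta(K), \tilde{\delta}(K))$ are defined as the unique positive solutions of the system

$$\delta(K) = \frac{1}{t}\mathrm{Tr}\, K^H C_T K \left[\sigma^2 \left(I + \tilde{\delta}(K) K^H C_T K\right)\right]^{-1}, \qquad \tilde{\delta}(K) = \frac{1}{t}\mathrm{Tr}\, C_R \left[\sigma^2 \left(I + \delta(K) C_R\right)\right]^{-1}$$

while $(\delta(K_d), \tilde{\delta}(K_d))$ are the positive solutions of

$$\delta(K_d) = \frac{1}{t}\mathrm{Tr}\, \Lambda \left[\sigma^2 \left(I + \tilde{\delta}(K_d) \Lambda\right)\right]^{-1}, \qquad \tilde{\delta}(K_d) = \frac{1}{t}\mathrm{Tr}\, C_R \left[\sigma^2 \left(I + \delta(K_d) C_R\right)\right]^{-1}$$

As $K^H C_T K = W \Lambda W^H$, for each $\kappa > 0$, we have

$$\frac{1}{t}\mathrm{Tr}\, K^H C_T K \left[\sigma^2 (I + \kappa K^H C_T K)\right]^{-1} = \frac{1}{t}\mathrm{Tr}\, \Lambda \left[\sigma^2 (I + \kappa \Lambda)\right]^{-1}$$

Therefore, $(\delta(K), \tilde{\delta}(K))$ and $(\delta(K_d), \tilde{\delta}(K_d))$ are positive solutions of the same system. The uniqueness of the solutions yields $(\delta(K), \tilde{\delta}(K)) = (\delta(K_d), \tilde{\delta}(K_d))$. From this, it is easy to check that $(\gamma(K), \tilde{\gamma}(K)) = (\gamma(K_d), \tilde{\gamma}(K_d))$. $\hat{I}_{mmse}(K)$ can thus be written as

$$\hat{I}_{mmse}(K) = \sum_{j=1}^{t} \log \frac{1}{\left(\left(I + \tilde{\delta}(K_d) W \Lambda W^H\right)^{-1}\right)_{j,j}}$$

where $\left(\left(I + \tilde{\delta}(K_d) W \Lambda W^H\right)^{-1}\right)_{j,j}$ is given by

$$\left(\left(I + \tilde{\delta}(K_d) W \Lambda W^H\right)^{-1}\right)_{j,j} = \sum_{k=1}^{t} \frac{|W_{j,k}|^2}{1 + \tilde{\delta}(K_d) \lambda_k}$$

where $W_{j,k}$ is the entry $(j, k)$ of the unitary matrix W and where $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_t)$. The function $y \to \log \frac{1}{y}$ is convex on $\mathbb{R}^+$. As $\sum_{k=1}^{t} |W_{j,k}|^2 = 1$ (because W is unitary), we have

$$\log \frac{1}{\sum_{k=1}^{t} \frac{|W_{j,k}|^2}{1 + \tilde{\delta}(K_d) \lambda_k}} \le \sum_{k=1}^{t} |W_{j,k}|^2 \log\left(1 + \tilde{\delta}(K_d) \lambda_k\right)$$

Summing over $j$, and using that $\sum_{j=1}^{t} |W_{j,k}|^2 = 1$, we get that

$$\hat{I}_{mmse}(K) \le \sum_{k=1}^{t} \log\left(1 + \tilde{\delta}(K_d) \lambda_k\right) = \hat{I}_{mmse}(K_d)$$

We now establish that $J_{mmse}(K) \le J_{mmse}(K_d)$. We recall that

$$\sigma^2 T_{T,j,j}(K) = \sum_{k=1}^{t} \frac{|W_{j,k}|^2}{1 + \tilde{\delta}(K_d) \lambda_k}$$

Similarly, we have

$$\left[(\sigma^2 T_T)^2\right]_{j,j}(K) = \sum_{k=1}^{t} \frac{|W_{j,k}|^2}{\left(1 + \tilde{\delta}(K_d) \lambda_k\right)^2}$$

As $\sum_{k=1}^{t} |W_{j,k}|^2 = 1$, the convexity of the function $x \to x^2$ implies that

$$\left[\sigma^2 T_{T,j,j}(K)\right]^2 \le \left[(\sigma^2 T_T(K))^2\right]_{j,j}$$

This implies that

$$J_{mmse}(K) \le \frac{1}{2 t\, \tilde{\delta}(K)^2}\, \frac{\tilde{\gamma}(K)}{1 - \sigma^4 \gamma(K) \tilde{\gamma}(K)} \sum_{j=1}^{t} \left(1 - \sigma^2 T_{T,j,j}(K)\right)^2$$

We also remark that

$$1 - \sigma^2 T_{T,j,j}(K) = w_j \left(I - \sigma^2 T_T(K_d)\right) w_j^H$$

where $w_j$ represents row $j$ of W. As matrix $I - \sigma^2 T_T(K_d)$ is diagonal and matrix W is unitary, we have

$$\sum_{j=1}^{t} \left(w_j \left(I - \sigma^2 T_T(K_d)\right) w_j^H\right)^2 \le \sum_{j=1}^{t} \left(1 - \sigma^2 T_{T,j,j}(K_d)\right)^2$$

We finally note that

$$\frac{1}{2 t\, \tilde{\delta}(K_d)^2}\, \frac{\tilde{\gamma}(K_d)}{1 - \sigma^4 \gamma(K_d) \tilde{\gamma}(K_d)} \sum_{j=1}^{t} \left(1 - \sigma^2 T_{T,j,j}(K_d)\right)^2 = J_{mmse}(K_d)$$

$J_{mmse}(K) \le J_{mmse}(K_d)$ follows from the equalities $(\delta(K), \tilde{\delta}(K)) = (\delta(K_d), \tilde{\delta}(K_d))$ and $(\gamma(K), \tilde{\gamma}(K)) = (\gamma(K_d), \tilde{\gamma}(K_d))$. This completes the proof of Proposition 3.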
Proposition 3 shows that there is no restriction to look for an optimal precoder in the following set $\mathcal{K}_d$:

$$\mathcal{K}_d = \left\{K \in \mathcal{K},\ K^H C_T K \text{ diagonal}\right\}$$    (43)

This allows us to formulate the evaluation of an optimal precoder as a $t$-dimensional optimization problem rather than a $t^2$-dimensional one. In order to state the corresponding result, we first slightly change our notations. If $K \in \mathcal{K}_d$, the quantities $\delta(K)$, $\gamma(K)$, ... are actually functions of the entries of the diagonal matrix $\Lambda = K^H C_T K$. Therefore, for $K \in \mathcal{K}_d$, $\delta(K)$, $\tilde{\delta}(K)$, ... will be denoted $\delta(\Lambda)$, $\tilde{\delta}(\Lambda)$, ....

The main result of this section is the following theorem.

Theorem 2: Let $C_T = U D U^H$ be the eigenvalue/eigenvector decomposition of matrix $C_T$. Let $\Lambda_{opt} = \mathrm{Diag}(\lambda_{1,opt}, \ldots, \lambda_{t,opt})$ be a positive diagonal matrix solution of the optimization problem

Problem 1: Maximize $\sum_{j=1}^{t} \log\left(1 + \tilde{\delta}(\Lambda) \lambda_j\right) + \frac{1}{2}\, \frac{\sigma^4 \gamma(\Lambda) \tilde{\gamma}(\Lambda)}{1 - \sigma^4 \gamma(\Lambda) \tilde{\gamma}(\Lambda)}$ under the constraints

$$\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_t) \ge 0, \qquad \frac{1}{t}\mathrm{Tr}(D^{-1} \Lambda) \le 1$$    (44)

Then, matrix $K_{opt}$ defined by

$$K_{opt} = U D^{-1/2} \Lambda_{opt}^{1/2}$$    (45)

belongs to $\mathcal{K}_d$, and maximizes $\bar{I}_{mmse}$.

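Proof. In order to prove Theorem 2, we consider a precoding matrix $K \in \mathcal{K}_d$, and denote by $\Lambda = \mathrm{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_t)$ the diagonal matrix $K^H C_T K$. We assume that the diagonal entries $(\lambda_j)_{j=1,\ldots,t}$ of $\Lambda$ are arranged in decreasing order. It is clear that K can be written as $K = U D^{-1/2} \Theta \Lambda^{1/2}$ where $\Theta$ is a unitary matrix. As $\frac{1}{t}\mathrm{Tr}(KK^H)$ is supposed to be less than or equal to 1, matrices $\Lambda$ and $\Theta$ satisfy $\frac{1}{t}\mathrm{Tr}(D^{-1} \Theta \Lambda \Theta^H) \le 1$. Each precoder $K \in \mathcal{K}_d$ can thus be parameterized by the unitary matrix $\Theta$ and the positive diagonal matrix $\Lambda$. As $K \in \mathcal{K}_d$, one can check easily that $J_{mmse}(\Lambda)$ reduces to $\frac{1}{2}\, \frac{\sigma^4 \gamma(\Lambda) \tilde{\gamma}(\Lambda)}{1 - \sigma^4 \gamma(\Lambda) \tilde{\gamma}(\Lambda)}$. Therefore, the maximization of $\bar{I}_{mmse}$ over $\mathcal{K}_d$ is equivalent to the following problem.

Problem 2: Maximize over $\Lambda$ and $\Theta$ the function $\sum_{j=1}^{t} \log\left(1 + \tilde{\delta}(\Lambda) \lambda_j\right) + \frac{1}{2}\, \frac{\sigma^4 \gamma(\Lambda) \tilde{\gamma}(\Lambda)}{1 - \sigma^4 \gamma(\Lambda) \tilde{\gamma}(\Lambda)}$ under the constraints

$$\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_t) \ge 0, \qquad \Theta \text{ unitary}, \qquad \frac{1}{t}\mathrm{Tr}(D^{-1} \Theta \Lambda \Theta^H) \le 1$$    (46)

Let $(\Lambda_*, \Theta_*)$ be a solution of the above problem. The diagonal elements of $\Lambda_*$ and $D$ are arranged in decreasing order. Therefore (see the Appendix of [?]), the following inequality holds:

$$\frac{1}{t}\mathrm{Tr}\left(D^{-1} \Theta_* \Lambda_* \Theta_*^H\right) \ge \frac{1}{t}\mathrm{Tr}\left(D^{-1} \Lambda_*\right)$$    (47)

Inequality (47) implies that if $(\Lambda_*, \Theta_*)$ is a solution of Problem 2, then $(\Lambda_*, I)$ is also a solution of Problem 2. This shows that the optimization of $\bar{I}_{mmse}$ is equivalent to Problem 1. This completes the proof of Theorem 2.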



Remark 1: We mention that it is not obvious that the singular vectors of the precoders that optimize the true mutual information $I_{mmse}$ have the structure (45). To the best of our knowledge, this is still an open question. One of the merits of the present asymptotic analysis is thus to show that the use of precoders of the form (45) is relevant.

V. Maximization of $\bar{I}_{mmse}$.

A. Maximization of $\hat{I}_{mmse}$ in the case of i.i.d. channels.

Problem 1 cannot in general be solved in closed form. In this paragraph, we consider the case $r = t$, $C_R = C_T = I_t$, for which some analytical results can be obtained. We establish in particular that the transmission scheme maximizing $\hat{I}_{mmse}$ is not a uniform power allocation along all the antennas, but an antenna selection scheme. This tends to indicate that in the context of i.i.d. channels, antenna selection may provide higher values of $I_{mmse}$ than a uniform power allocation over the $t$ available transmit antennas. Therefore, even in the simplest channel context, the maximization of $I_{mmse}$ and of the usual Shannon mutual information $I$ are different problems.

Theorem 2 implies that precoders $K_{opt}$ maximizing $\hat{I}_{mmse}$ can be written as $K_{opt} = \Lambda_{opt}^{1/2}$ where $\Lambda_{opt}$ is solution of the problem

Problem 3: Maximize $\sum_{j=1}^{t} \log\left(1 + \lambda_j \tilde{\delta}(\Lambda)\right)$ under the constraints $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_t) \ge 0$ and $\frac{1}{t}\mathrm{Tr}(\Lambda) \le 1$, where $\tilde{\delta}(\Lambda)$ is the unique positive solution of the equation

$$\sigma^2 \tilde{\delta} + \frac{1}{t} \sum_{j=1}^{t} \frac{\lambda_j \tilde{\delta}}{1 + \lambda_j \tilde{\delta}} = 1$$    (48)

It is easy to check that $\tilde{\delta}(\Lambda)$ is the positive solution of (48) in the particular context considered here. This justifies the statement of Problem 3. The solution of this problem is given in the following Proposition.

Proposition 4: The diagonal entries of the optimal matrices $\Lambda_{opt}$ are either 0 or equal to $\frac{t}{s}$, where $s \leq t$, the number of nonzero entries of $\Lambda_{opt}$, is the integer that maximizes

$$s \log\left(\frac{\frac{t}{s} - 1 + \sigma^2 + \sqrt{\left(\frac{t}{s} - 1 + \sigma^2\right)^2 + 4\sigma^2}}{2\sigma^2}\right) \tag{49}$$
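As an illustration of Proposition 4, the following minimal sketch evaluates expression (49) for every $s$ and compares it with a direct numerical evaluation of the objective of Problem 3. It rests on several assumptions that are not stated in this form in the text: the SNR is taken equal to $1/\sigma^2$, logarithms are natural, the equation defining $\delta(\Lambda)$ is written as $\sigma^2\delta + \frac{1}{t}\sum_j \frac{\lambda_j\delta}{1+\lambda_j\delta} = 1$, and all function names (`delta_iid`, `ibar_selection`, `ibar_closed_form`, `best_s`) are hypothetical.

```python
import math


def delta_iid(lams, sigma2, iters=200):
    """Positive root of sigma2*d + (1/t)*sum(l*d/(1+l*d)) = 1 (assumed form), by bisection."""
    t = len(lams)

    def g(d):
        return sigma2 * d + sum(l * d / (1.0 + l * d) for l in lams) / t

    lo, hi = 0.0, 1.0 / sigma2  # g(lo) = 0 < 1 <= g(hi), and g is increasing
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ibar_selection(t, s, sigma2):
    """Objective of Problem 3 when s antennas carry power t/s each (numerical route)."""
    lams = [t / s] * s + [0.0] * (t - s)
    d = delta_iid(lams, sigma2)
    return sum(math.log(1.0 + l * d) for l in lams)


def ibar_closed_form(t, s, sigma2):
    """Expression (49), in nats."""
    c = t / s - 1.0 + sigma2
    return s * math.log((c + math.sqrt(c * c + 4.0 * sigma2)) / (2.0 * sigma2))


def best_s(t, sigma2):
    """Number of active antennas maximizing (49)."""
    return max(range(1, t + 1), key=lambda s: ibar_closed_form(t, s, sigma2))


if __name__ == "__main__":
    t = 8
    sigma2 = 10 ** (-15 / 10)  # assumption: SNR = 1/sigma2 = 15 dB
    for s in range(1, t + 1):
        print(s, ibar_closed_form(t, s, sigma2), ibar_selection(t, s, sigma2))
    print("best s:", best_s(t, sigma2))
```

Under these assumptions the two routes agree for every $s$, and for $t = 8$ at 15 dB the maximizer of (49) is $s = 6$, while at low SNR all antennas are used.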

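Proof. We first show that any optimal matrix $\Lambda_{opt}$ solution of Problem 3 verifies $\frac{1}{t}\mathrm{Tr}(\Lambda_{opt}) = 1$. For this, we consider a positive diagonal matrix $\Lambda$ for which $\frac{1}{t}\mathrm{Tr}(\Lambda) < 1$, and establish that if $\Gamma$ is the positive diagonal matrix with normalized trace 1 defined by $\Gamma = \frac{\Lambda}{\frac{1}{t}\mathrm{Tr}(\Lambda)}$, then $\bar{I}_{mmse}(\Gamma) > \bar{I}_{mmse}(\Lambda)$.

For this, we show that the function $\mu \to \mu\delta(\mu\Lambda)$ is strictly increasing on $\mathbb{R}^+$. We remark that $\delta(\mu\Lambda)$ is the unique positive solution of the equation

$$\sigma^2\delta + \frac{1}{t}\sum_{j=1}^{t}\frac{\mu\lambda_j\delta}{1 + \mu\lambda_j\delta} = 1 \tag{50}$$

or equivalently, that $\mu\delta(\mu\Lambda)$ is the unique solution of the equation $g_\mu(\rho) = 1$ where $g_\mu$ is defined by

$$g_\mu(\rho) = \frac{\sigma^2\rho}{\mu} + \frac{1}{t}\sum_{j=1}^{t}\frac{\lambda_j\rho}{1 + \lambda_j\rho}$$

For each $\mu$, the function $\rho \to g_\mu(\rho)$ is strictly increasing. Moreover, if $\mu_1 < \mu_2$, then $g_{\mu_1}(\rho) > g_{\mu_2}(\rho)$. From this, we get immediately that $\mu_1\delta(\mu_1\Lambda) < \mu_2\delta(\mu_2\Lambda)$. We have thus shown that $\mu \to \mu\delta(\mu\Lambda)$ is strictly increasing. We put $\mu = \frac{1}{\frac{1}{t}\mathrm{Tr}(\Lambda)}$. As $\frac{1}{t}\mathrm{Tr}(\Lambda) < 1$, $\mu$ is strictly greater than 1. Therefore, $\mu\delta(\mu\Lambda) > \delta(\Lambda)$, or $\mu\delta(\Gamma) > \delta(\Lambda)$. Hence,

$$\sum_{j=1}^{t}\log\left(1 + \mu\lambda_j\delta(\Gamma)\right) > \sum_{j=1}^{t}\log\left(1 + \lambda_j\delta(\Lambda)\right)$$

As the $(\mu\lambda_j)_{j=1,\ldots,t}$ coincide with the diagonal entries of matrix $\Gamma$, the above inequality implies that $\bar{I}_{mmse}(\Gamma) > \bar{I}_{mmse}(\Lambda)$.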

The above discussion shows that the constraint $\frac{1}{t}\mathrm{Tr}(\Lambda) \leq 1$ can be replaced by $\frac{1}{t}\mathrm{Tr}(\Lambda) = 1$ in the statement of Problem 3. In order to characterize the solutions of the maximization problem, we replace the variables $(\lambda_j)_{j=1,\ldots,t}$ by the variables $(x_j)_{j=1,\ldots,t}$ defined by

$$x_j = \lambda_j\delta(\Lambda) \tag{51}$$

for $j = 1, \ldots, t$. We claim that the maximization of $\bar{I}_{mmse}$ under the constraints $\lambda_j \geq 0$ for $j = 1, \ldots, t$ and $\frac{1}{t}\mathrm{Tr}(\Lambda) = 1$ is equivalent to the following problem

Problem 4: Maximize $\sum_{j=1}^{t}\log(1 + x_j)$ under the constraints $x_j \geq 0$ for each $j = 1, \ldots, t$, and

$$\sigma^2\frac{1}{t}\sum_{j=1}^{t}x_j + \frac{1}{t}\sum_{j=1}^{t}\frac{x_j}{1 + x_j} = 1 \tag{52}$$

Indeed, let $(x_j)_{j=1,\ldots,t}$ be positive numbers satisfying (52), and consider $\delta = \frac{1}{t}\sum_{j=1}^{t}x_j$ and $\lambda_j = \frac{x_j}{\delta}$ for $j = 1, \ldots, t$. The matrix $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_t)$ is positive and satisfies $\frac{1}{t}\mathrm{Tr}(\Lambda) = 1$. Moreover, $\delta$ is solution of the equation (48), which implies that $\delta = \delta(\Lambda)$. Conversely, if $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_t)$ is positive and satisfies $\frac{1}{t}\mathrm{Tr}(\Lambda) = 1$, the $(x_j)_{j=1,\ldots,t}$ defined by (51) are positive and satisfy the constraint (52). The conclusion follows from the observation that $\sum_{j=1}^{t}\log\left(1 + \lambda_j\delta(\Lambda)\right) = \sum_{j=1}^{t}\log(1 + x_j)$.
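The change of variables (51) can be checked numerically. The sketch below is only an illustration under stated assumptions: the equation defining $\delta(\Lambda)$ is taken to be $\sigma^2\delta + \frac{1}{t}\sum_j\frac{\lambda_j\delta}{1+\lambda_j\delta} = 1$, the constraint (52) is taken to be $\sigma^2\frac{1}{t}\sum_j x_j + \frac{1}{t}\sum_j\frac{x_j}{1+x_j} = 1$, and the helper names are hypothetical.

```python
def delta_iid(lams, sigma2, iters=200):
    """Positive root of sigma2*d + (1/t)*sum(l*d/(1+l*d)) = 1 (assumed form), by bisection."""
    t = len(lams)
    lo, hi = 0.0, 1.0 / sigma2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = sigma2 * mid + sum(l * mid / (1.0 + l * mid) for l in lams) / t
        if g < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def to_x(lams, sigma2):
    """Map (lambda_j) to x_j = lambda_j * delta(Lambda), as in (51)."""
    d = delta_iid(lams, sigma2)
    return [l * d for l in lams], d


def constraint_52(xs, sigma2):
    """Left-hand side of the constraint (52), assumed form."""
    t = len(xs)
    return sigma2 * sum(xs) / t + sum(x / (1.0 + x) for x in xs) / t


if __name__ == "__main__":
    lams = [2.0, 1.0, 0.5, 0.5]  # normalized trace equal to 1
    sigma2 = 0.1
    xs, d = to_x(lams, sigma2)
    print("constraint (52):", constraint_52(xs, sigma2))  # close to 1
    print("mean(x) - delta:", sum(xs) / len(xs) - d)      # close to 0
    print("recovered lambdas:", [x / d for x in xs])
```

When the normalized trace of $\Lambda$ equals 1, the mean of the $x_j$ coincides with $\delta(\Lambda)$, which is what makes the inverse map $\lambda_j = x_j/\delta$ well defined.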

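The Karush-Kuhn-Tucker (KKT) conditions provide necessary conditions for optimality. If $x = (x_1, \ldots, x_t)^T$ is a solution of Problem 4, then there exists $\mu$ for which

$$\frac{\partial\mathcal{L}(x,\mu)}{\partial x_j} = 0 \ \text{ if } x_j > 0, \qquad \frac{\partial\mathcal{L}(x,\mu)}{\partial x_j} \leq 0 \ \text{ if } x_j = 0 \tag{53}$$

where $\mathcal{L}(x,\mu)$ is defined by

$$\mathcal{L}(x,\mu) = \sum_{j=1}^{t}\log(1 + x_j) - \frac{1}{\mu}\left(\sigma^2\sum_{j=1}^{t}x_j + \sum_{j=1}^{t}\frac{x_j}{1 + x_j}\right)$$

If $x_j > 0$, we obtain

$$\mu = \sigma^2(1 + x_j) + \frac{1}{1 + x_j}$$

and if $x_j = 0$, we have

$$\mu \leq 1 + \sigma^2 \tag{54}$$

If $s \leq t$ is the number of nonzero $x_j$'s, averaging over the nonzero entries gives

$$\mu = \sigma^2 + \sigma^2\frac{1}{s}\sum_{j=1}^{t}x_j + \frac{1}{s}\sum_{j:\,x_j > 0}\frac{1}{1 + x_j} \tag{55}$$

or, since each zero entry contributes 1 to $\sum_{j=1}^{t}\frac{1}{1+x_j}$,

$$\mu = \sigma^2 + \sigma^2\frac{1}{s}\sum_{j=1}^{t}x_j + \frac{1}{s}\sum_{j=1}^{t}\frac{1}{1 + x_j} - \frac{t - s}{s}$$

Using the identity $\frac{x}{1+x} = 1 - \frac{1}{1+x}$, we get that the constraint (52) can also be written as

$$\frac{1}{t}\sum_{j=1}^{t}\frac{1}{1 + x_j} = \sigma^2\frac{1}{t}\sum_{j=1}^{t}x_j$$

$\mu$ is therefore given by

$$\mu = \sigma^2 + 2\sigma^2\frac{1}{s}\sum_{j=1}^{t}x_j - \frac{t - s}{s} \tag{56}$$

We also note that each $x_j > 0$ is a solution of the equation

$$\sigma^2x^2 + (2\sigma^2 - \mu)x + (1 + \sigma^2 - \mu) = 0 \tag{57}$$

If $\mu > 1 + \sigma^2$, (54) implies that $s = t$. Moreover, the equation (57) has a single strictly positive solution $y$. Therefore, $x_j = y$ for $j = 1, \ldots, t$. Using the correspondence (51) between $x$ and $\Lambda$, we obtain immediately that $\Lambda = I_t$.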

We now consider the case $\mu \leq 1 + \sigma^2$. If $\mu < 2\sigma$, equation (57) has no real solution. Therefore, $\mu$ must be greater than or equal to $2\sigma$. The equation must have at least one positive solution. As $1 + \sigma^2 - \mu \geq 0$, this implies that $\mu > 2\sigma^2$. In sum, $\mu$ must be greater than $\max(2\sigma, 2\sigma^2)$, and the equation (57) has 2 positive solutions $y_1$ and $y_2$ given by

$$y_1 = \frac{\mu - 2\sigma^2 + \sqrt{\mu^2 - 4\sigma^2}}{2\sigma^2}, \qquad y_2 = \frac{\mu - 2\sigma^2 - \sqrt{\mu^2 - 4\sigma^2}}{2\sigma^2}$$
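As a quick sanity check of the two roots, the sketch below evaluates them for one admissible value of $\mu$ and verifies that they solve the quadratic equation (57), taken here as $\sigma^2x^2 + (2\sigma^2 - \mu)x + (1 + \sigma^2 - \mu) = 0$. The function names are hypothetical, and the numerical values $\mu = 1$, $\sigma^2 = 0.1$ are arbitrary choices satisfying $\max(2\sigma, 2\sigma^2) < \mu \leq 1 + \sigma^2$.

```python
import math


def kkt_roots(mu, sigma2):
    """Roots y1 >= y2 of sigma2*x^2 + (2*sigma2 - mu)*x + (1 + sigma2 - mu) = 0."""
    disc = mu * mu - 4.0 * sigma2  # discriminant divided by 1, after simplification
    if disc < 0:
        raise ValueError("no real root: mu < 2*sigma")
    r = math.sqrt(disc)
    y1 = (mu - 2.0 * sigma2 + r) / (2.0 * sigma2)
    y2 = (mu - 2.0 * sigma2 - r) / (2.0 * sigma2)
    return y1, y2


def quadratic_57(x, mu, sigma2):
    """Left-hand side of equation (57)."""
    return sigma2 * x * x + (2.0 * sigma2 - mu) * x + (1.0 + sigma2 - mu)


if __name__ == "__main__":
    mu, sigma2 = 1.0, 0.1
    y1, y2 = kkt_roots(mu, sigma2)
    print(y1, y2)
    print(quadratic_57(y1, mu, sigma2), quadratic_57(y2, mu, sigma2))
    print((1.0 + y1) * (1.0 + y2), 1.0 / sigma2)
```

The last line illustrates the identity $(1 + y_1)(1 + y_2) = 1/\sigma^2$, which follows from the sum and product of the roots and explains the term $\frac{s}{2}\log\frac{1}{\sigma^2}$ appearing later in the proof.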

Therefore, each nonzero $x_j$ can be equal to $y_1$ or to $y_2$. We denote $\#\{j, x_j = y_1\}$ as $\frac{s}{2} + u$ and $\#\{j, x_j = y_2\}$ as $\frac{s}{2} - u$, where $u$ is an integer if $s$ is even and $u$ is the sum of $\frac{1}{2}$ with an integer if $s$ is odd. Note that if $(x_j)_{j=1,\ldots,t}$ is a solution of Problem 4, $u$ must be positive because $y_1 > y_2$ and

$$\sum_{j=1}^{t}\log(1 + x_j) = \left(\frac{s}{2} + u\right)\log(1 + y_1) + \left(\frac{s}{2} - u\right)\log(1 + y_2)$$

$\frac{1}{t}\sum_{j=1}^{t}x_j$ is given by

$$\frac{1}{t}\sum_{j=1}^{t}x_j = \frac{s}{t}\,\frac{\mu - 2\sigma^2}{2\sigma^2} + \frac{u}{t}\,\frac{\sqrt{\mu^2 - 4\sigma^2}}{\sigma^2}$$

Plugging this into (56) and solving the equation w.r.t. $\mu$ yields

$$\mu^2 = 4\sigma^2 + \frac{(t - s + \sigma^2s)^2}{4u^2} \tag{58}$$

This allows to express $y_1$ and $y_2$ in terms of $t, s, u, \sigma^2$. After some calculations, we obtain that $\sum_{j=1}^{t}\log(1 + x_j) = \left(\frac{s}{2} + u\right)\log(1 + y_1) + \left(\frac{s}{2} - u\right)\log(1 + y_2)$ is given by

$$\sum_{j=1}^{t}\log(1 + x_j) = \frac{s}{2}\log\frac{1}{\sigma^2} + u\log\left(\frac{\sqrt{1 + b^2u^2} + 1}{\sqrt{1 + b^2u^2} - 1}\right) \tag{59}$$

where $b^2$ is defined by

$$b^2 = \frac{16\sigma^2}{(t - s + \sigma^2s)^2}$$

It is easily seen that the right-hand side of (59), considered as a function of $u$, is increasing on $\mathbb{R}^+$. Therefore, it is maximum for $u = \frac{s}{2}$. This implies that $\#\{j, x_j = y_1\} = s$ and $\#\{j, x_j = y_2\} = 0$. Moreover, the right-hand side of (59) for $u = s/2$ coincides with (49). This completes the proof of Proposition 4.
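The two facts used at the end of the proof (monotonicity of the right-hand side of (59) in $u$, and its coincidence with (49) at $u = s/2$) can be checked numerically. The sketch below assumes that (59) reads $\frac{s}{2}\log\frac{1}{\sigma^2} + u\log\frac{\sqrt{1+b^2u^2}+1}{\sqrt{1+b^2u^2}-1}$ with $b^2 = \frac{16\sigma^2}{(t-s+\sigma^2s)^2}$, and that (49) reads $s\log\frac{c + \sqrt{c^2 + 4\sigma^2}}{2\sigma^2}$ with $c = \frac{t}{s} - 1 + \sigma^2$. Function names and the numerical values are hypothetical choices.

```python
import math


def rhs_59(t, s, u, sigma2):
    """Right-hand side of (59) for u > 0 (assumed form)."""
    b2 = 16.0 * sigma2 / (t - s + sigma2 * s) ** 2
    w = math.sqrt(1.0 + b2 * u * u)
    return 0.5 * s * math.log(1.0 / sigma2) + u * math.log((w + 1.0) / (w - 1.0))


def expr_49(t, s, sigma2):
    """Expression (49) (assumed form), in nats."""
    c = t / s - 1.0 + sigma2
    return s * math.log((c + math.sqrt(c * c + 4.0 * sigma2)) / (2.0 * sigma2))


if __name__ == "__main__":
    t, s, sigma2 = 8, 6, 10 ** (-1.5)
    us = [0.5 * k for k in range(1, s + 1)]  # u = 0.5, 1.0, ..., s/2
    values = [rhs_59(t, s, u, sigma2) for u in us]
    print(values)
    print(rhs_59(t, s, s / 2.0, sigma2), expr_49(t, s, sigma2))
```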



We now check numerically that for certain values of $\sigma^2$, $s$ does not coincide with $t$. In Figure 3, we have considered the case $r = t = 8$, and have represented the values of $\bar{I}_{mmse}$ for $s = 6$ and $s = 8$. It is clear that if the SNR is greater than 8 dB, then $s = 6$ provides higher values of $\bar{I}_{mmse}$. The values of the improved approximation and of the true $I_{mmse}$ are also higher for $s = 6$ than for $s = 8$. This confirms that the antenna selection scheme may be better than the uniform power allocation across all the transmit antennas. Figure 4 represents $\bar{I}_{mmse}$, the improved approximation and $I_{mmse}$ versus $s$ when the SNR is equal to 15 dB, and demonstrates that $s = 6$ is the optimum value for $\bar{I}_{mmse}$.

We note that if $s \neq t$, function $\bar{I}_{mmse}$ reaches its maximum at different points because there is more than one diagonal matrix whose entries are either 0 or $\frac{t}{s}$. Function $\bar{I}_{mmse}$ is thus a non-concave function of the precoding matrix: the set of maximizers of a concave function over a convex set is convex, whereas a strict convex combination of two such matrices is not of this form. Using the trick introduced in JH, it is possible to establish that $I_{mmse}$ is itself, in general, non-concave.

Fig. 3. Relevance of the antenna selection scheme, s = 6 versus s = 8 (horizontal axis: SNR (dB))

Fig. 4. Relevance of the antenna selection scheme, SNR = 15 dB (horizontal axis: s)

B. Study of Problem 1

We consider again the optimization of $\bar{I}_{mmse}$ in the bi-correlated case. Theorem 2 shows that the determination of an optimal precoder $K_{opt}$ requires solving the optimization Problem 1. As this problem cannot be solved in closed form, we use a gradient algorithm. We propose to parameterize $\lambda_j$ by $\lambda_j = \alpha_j^2$ in order to get rid of the constraint $\lambda_j \geq 0$, and to use a standard gradient algorithm with projection, at each iteration, on the power constraint of Problem 1 rewritten in terms of the $(\alpha_j)_{j=1,\ldots,t}$. Note that the convergence of this algorithm towards a global maximum of $\bar{I}_{mmse}$ is not guaranteed because this last function is probably non-concave in general.
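The following sketch illustrates the kind of projected gradient iteration described above. It is not the authors' implementation: since the statement of Problem 1 is not reproduced here, the sketch is applied to the simpler objective of Problem 3 (i.i.d. case, assumed equation $\sigma^2\delta + \frac{1}{t}\sum_j\frac{\lambda_j\delta}{1+\lambda_j\delta} = 1$), the gradient is approximated by central finite differences, the projection is a rescaling onto $\frac{1}{t}\sum_j\alpha_j^2 = 1$, and a step-halving rule is added so that the objective never decreases. All names and default parameters are hypothetical.

```python
import math
import random


def delta_iid(lams, sigma2, iters=100):
    """Positive root of sigma2*d + (1/t)*sum(l*d/(1+l*d)) = 1 (assumed form), by bisection."""
    t = len(lams)
    lo, hi = 0.0, 1.0 / sigma2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = sigma2 * mid + sum(l * mid / (1.0 + l * mid) for l in lams) / t
        if g < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def objective(alpha, sigma2):
    """Objective of Problem 3 with lambda_j = alpha_j**2."""
    lams = [a * a for a in alpha]
    d = delta_iid(lams, sigma2)
    return sum(math.log(1.0 + l * d) for l in lams)


def project(alpha):
    """Rescale alpha so that (1/t) * sum(alpha_j**2) = 1."""
    scale = math.sqrt(sum(a * a for a in alpha) / len(alpha))
    return [a / scale for a in alpha]


def numerical_gradient(alpha, sigma2, h=1e-5):
    """Central finite-difference gradient of the objective w.r.t. alpha."""
    grad = []
    for j in range(len(alpha)):
        up = list(alpha)
        dn = list(alpha)
        up[j] += h
        dn[j] -= h
        grad.append((objective(up, sigma2) - objective(dn, sigma2)) / (2.0 * h))
    return grad


def projected_gradient_ascent(t, sigma2, iters=100, step=0.5, seed=0):
    """Return (lambdas, final objective, initial objective)."""
    rng = random.Random(seed)
    alpha = project([rng.uniform(0.5, 1.5) for _ in range(t)])
    start = value = objective(alpha, sigma2)
    for _ in range(iters):
        grad = numerical_gradient(alpha, sigma2)
        trial = step
        while trial > 1e-8:
            cand = project([a + trial * g for a, g in zip(alpha, grad)])
            cand_value = objective(cand, sigma2)
            if cand_value > value:
                alpha, value = cand, cand_value
                break
            trial *= 0.5  # halve the step until the objective improves
        else:
            break  # no improving step found: stop at a (possibly local) maximum
    return [a * a for a in alpha], value, start


if __name__ == "__main__":
    lams, value, start = projected_gradient_ascent(4, 0.1)
    print(lams, value, start)
```

Because the objective is probably non-concave, the point returned depends on the random initialization; several restarts with different seeds would be a natural safeguard.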
Fig. 5. Impact of precoding scheme (horizontal axis: SNR (dB); curves: no precoder and the maximization schemes compared in the text, including the maximization of $I_{mmse}$ without predefined structure)



C. Numerical illustration 

We present some simulation results to illustrate the impact of the precoder optimization scheme in the
case $r = t = 4$. $C_t$ and $C_r$ are generated according to model (38). In the present numerical experiment,
(v<t> T ,<h) = (0-5, |) and (o- (j)R ,4> R ) = (0.4, £). 

In Figure 5, we provide the mutual informations $I_{mmse}$ (evaluated using Monte Carlo simulations with 1000 channel realizations) corresponding to the following precoding schemes:

• (i) No precoding

• (ii) Maximization of $\bar{I}_{mmse}(K)$ for precoders structured as in (45)

• (iii) Maximization of the improved approximation of $I_{mmse}(K)$ for precoders structured as in (45)

• (iv) Maximization of $I_{mmse}(K)$ for precoders structured as in (45)

• (v) Maximization of $I_{mmse}(K)$ when the precoders have no particular structure.

The various maximizations are achieved by the gradient algorithm with projection on the relevant constraint. Note that the gradients of $I_{mmse}(K)$ w.r.t. the parameters $(\alpha_j)_{j=1,\ldots,t}$ and w.r.t. the entries of $K$ have no closed-form expression. At each iteration of the algorithm, they are evaluated by Monte Carlo simulations (1000 channel realizations are used). This explains why the direct maximization of $I_{mmse}$ leads to algorithms with a very high computational cost.

We now comment on Figure 5. We first compare precoding schemes (iv) and (v). The two curves match perfectly, showing that in practice, the structure (45) seems to optimize $I_{mmse}(K)$ even for $r = t = 4$. The comparison of schemes (ii) and (iii) indicates that the use of the improved approximation yields significant gains for SNRs greater than 10 dB. We finally observe that the precoding schemes (ii) and (iv,v) provide very close mutual informations when SNR < 2 dB and SNR > 10 dB. Finally, the comparison of (i) (no precoding) with the other schemes shows that precoding significantly increases $I_{mmse}$.

We finally compare the processing time (on a 1.83 GHz Intel) needed by schemes (ii), (iii), (iv).

Precoding scheme                                      Processing time (s)
(ii) maximization of $\bar{I}_{mmse}$                 0.39
(iii) maximization of the improved approximation      0.25
(iv) maximization of $I_{mmse}$                       337.6

It is seen that the processing times needed to implement schemes (ii) and (iii) are about 1000 times smaller than in the context of scheme (iv), while the use of the improved approximation instead of $\bar{I}_{mmse}$ does not lead to a significant increase of the computational cost.
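For reference, the speed-up factors implied by the reported processing times (0.39 s and 0.25 s against 337.6 s) can be worked out directly; the rounding below is ours:

```latex
\frac{337.6}{0.39} \approx 866,
\qquad
\frac{337.6}{0.25} \approx 1350
```

Both ratios are of the order of $10^3$, which is the sense in which the optimization of the approximations is about 1000 times faster than the direct maximization.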

VI. Concluding remarks. 

We summarize the advantages of our asymptotic analysis of $I_{mmse}$. First, it proves the relevance of precoders $K = U\tilde{D}^{-1/2}\Lambda^{1/2}$, where $\Lambda$ is a positive diagonal matrix. Second, the entries of the optimum matrix $\Lambda$ are solution of an optimization problem that can be solved by a computationally attractive gradient algorithm. If, in contrast, matrix $\Lambda$ was designed to maximize the true mutual information $I_{mmse}$, the corresponding gradient algorithm would have a high computational cost. This is because this function of $\Lambda$, as well as its derivatives w.r.t. the entries of $\Lambda$, cannot be expressed in closed form. They have to be evaluated by Monte Carlo simulations, which considerably complicates the maximization algorithm.

Acknowledgements. The authors thank Aris Moustakas for suggesting that the relative error of the 
approximation I mmse was aO(p) term and not a O(^) term. Useful discussions with Walid Hachem 
and Jamal Najim are also acknowledged. 

Appendix A
Proof of Proposition 2

The proof of Proposition 2 uses extensively the Nash-Poincaré inequality as well as an integration by parts formula valid in the Gaussian random matrices context. The combined use of these two tools was introduced recently by Pastur in [14] in the context of simple models. This method was used in order to evaluate the asymptotic behaviour of the Shannon capacity of bi-correlated Rayleigh MIMO channels in [6] and of bi-correlated Rician MIMO channels in [5]. We however notice that Proposition 2 has not been established in [6] and [5].

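Let $\Phi(Y)$ be a function of the entries of matrix $Y$ defined by (9). Then, under certain extra assumptions on $\Phi$ (see [6]), the following Nash-Poincaré inequality holds true:

$$\mathrm{Var}(\Phi(Y)) \leq \frac{1}{t}\sum_{i=1}^{r}\sum_{j=1}^{t}d_i\tilde{d}_j\,E\left[\left|\frac{\partial\Phi(Y)}{\partial Y_{i,j}}\right|^2 + \left|\frac{\partial\Phi(Y)}{\partial\overline{Y}_{i,j}}\right|^2\right] \tag{60}$$

where $\overline{Y}_{i,j}$ represents the complex conjugate of $Y_{i,j}$. We also recall that the integration by parts formula gives

$$E\left[Y_{p,q}\Phi(Y)\right] = \frac{d_p\tilde{d}_q}{t}E\left[\frac{\partial\Phi(Y)}{\partial\overline{Y}_{p,q}}\right] \tag{61}$$

and

$$E\left[\overline{Y}_{p,q}\Phi(Y)\right] = \frac{d_p\tilde{d}_q}{t}E\left[\frac{\partial\Phi(Y)}{\partial Y_{p,q}}\right] \tag{62}$$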



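We first establish (18). For this, we first introduce some notations. $\beta$ is defined by $\beta = \frac{1}{t}\mathrm{Tr}(DQ)$ and $\alpha = E(\beta)$. $R$ is the $r \times r$ diagonal matrix given by

$$R = \left[\sigma^2(I_r + \tilde{\alpha}D)\right]^{-1} \tag{63}$$

$\tilde{\alpha}$ is defined by $\tilde{\alpha} = \frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{R})$, and $\tilde{R}$ is the $t \times t$ diagonal matrix given by

$$\tilde{R} = \left[\sigma^2(I_t + \alpha\tilde{D})\right]^{-1} \tag{64}$$

If $x$ is a random variable, $\overset{\circ}{x}$ represents the zero-mean random variable $\overset{\circ}{x} = x - E(x)$.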

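Using calculations similar to [6], section 4.1, we obtain that

$$E\left((Qy_j)_k\overline{Y}_{i,j}\right) = \frac{1}{t}\frac{d_i\tilde{d}_j}{1 + \alpha\tilde{d}_j}E(Q_{k,i}) - \frac{\tilde{d}_j}{1 + \alpha\tilde{d}_j}E\left(\overset{\circ}{\beta}(Qy_j)_k\overline{Y}_{i,j}\right)$$

for each $k, i, j$. Summing over $j$ gives

$$E\left((QYY^H)_{k,i}\right) = \sigma^2d_i\tilde{\alpha}E(Q_{k,i}) - \sigma^2E\left(\overset{\circ}{\beta}(QY\tilde{D}\tilde{R}Y^H)_{k,i}\right) \tag{65}$$

Plugging the resolvent identity (see Eq. (10) of [6])

$$Q_{k,i} = \frac{\delta(k - i)}{\sigma^2} - \frac{(QYY^H)_{k,i}}{\sigma^2} \tag{66}$$

into (65), we obtain

$$E(Q_{k,i}) = \frac{\delta(k - i)}{\sigma^2} - d_i\tilde{\alpha}E(Q_{k,i}) + E\left(\overset{\circ}{\beta}(QY\tilde{D}\tilde{R}Y^H)_{k,i}\right)$$

Solving w.r.t. $E(Q_{k,i})$, we get

$$E(Q_{k,i}) = R_{i,i}\delta(k - i) + \sigma^2E\left(\overset{\circ}{\beta}(QY\tilde{D}\tilde{R}Y^HR)_{k,i}\right)$$

If $u$ is a deterministic unit norm row vector, we eventually obtain

$$E(uQu^H) = uRu^H + \sigma^2E\left(\overset{\circ}{\beta}\,uQY\tilde{D}\tilde{R}Y^HRu^H\right) \tag{67}$$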
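We now prove that the second term of the right-hand side of (67) can be bounded by a $O(\frac{1}{t^{3/2}})$ term independent of $u$. As $E(\overset{\circ}{\beta}) = 0$, the Schwarz inequality gives

$$\left|E\left(\overset{\circ}{\beta}\,uQY\tilde{D}\tilde{R}Y^HRu^H\right)\right| \leq \left(E|\overset{\circ}{\beta}|^2\right)^{1/2}\left(\mathrm{Var}\left(uQY\tilde{D}\tilde{R}Y^HRu^H\right)\right)^{1/2} \tag{68}$$

Using the first item of (15) in the case $M = D$, we get that $\left(E|\overset{\circ}{\beta}|^2\right)^{1/2} = O(\frac{1}{t})$. In order to study the behaviour of the second term of the right-hand side of (68), we establish the following lemma.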

Lemma 2: Let $A$ be a uniformly bounded diagonal deterministic matrix, $u$ a unit norm deterministic row vector, and $v$ a uniformly bounded deterministic row vector. Then,

$$\mathrm{Var}(uQYAY^Hv^H) \leq \frac{C}{t} \tag{69}$$

where $C$ is a constant independent of $u$, $v$, and $A$.

Proof. In order to prove the lemma, we use the Nash-Poincaré inequality (60) in the case $\Phi(Y) = uQYAY^Hv^H$. We define $\eta$ as $\eta = uQYAY^Hv^H$. Straightforward calculations lead to

$$\frac{\partial\eta}{\partial\overline{Y}_{i,j}} = -uQy_j\,(QYAY^Hv^H)_i + A_{j,j}\overline{v}_i\,uQy_j \tag{70}$$

We establish that

$$E\sum_{i,j}\left|\frac{\partial\eta}{\partial\overline{Y}_{i,j}}\right|^2 \leq C \tag{71}$$

where $C$ is a constant independent of $u, v, A$. (70), $|A_{j,j}| \leq \|A\|$ and the Schwarz inequality imply that

$$E\left|\frac{\partial\eta}{\partial\overline{Y}_{i,j}}\right|^2 \leq 2|v_i|^2\|A\|^2E|uQy_j|^2 + 2E\left(|(QYAY^Hv^H)_i|^2\,|uQy_j|^2\right)$$

Summing over $i, j$ yields

$$E\sum_{i,j}\left|\frac{\partial\eta}{\partial\overline{Y}_{i,j}}\right|^2 \leq 2\|A\|^2\|v\|^2E\|uQY\|^2 + 2E\left(\|QYAY^Hv^H\|^2\,\|uQY\|^2\right)$$

$E(\|uQY\|^2) = E(uQYY^HQu^H)$. Using the resolvent identity (66), we obtain that $QYY^H = I - \sigma^2Q$. Therefore, $QYY^HQ = Q - \sigma^2QQ$ and $QYY^HQ \leq Q$. This implies that $\|uQY\|^2 \leq uQu^H$. As matrix $Q$ satisfies $Q \leq \frac{I}{\sigma^2}$, we obtain that

$$\|uQY\|^2 \leq \frac{1}{\sigma^2} \tag{72}$$
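The chain of inequalities leading to (72) can be written out as follows. This is only a restatement of the steps above, under the assumption (suggested by the identity $QYY^H = I - \sigma^2Q$, but not restated in this appendix) that $Q$ denotes the resolvent $(YY^H + \sigma^2I_r)^{-1}$:

```latex
\begin{align*}
Q\,(YY^H + \sigma^2 I_r) = I_r
  &\;\Longrightarrow\; QYY^H = I_r - \sigma^2 Q, \\
QYY^HQ = Q - \sigma^2 Q^2 \le Q
  &\;\Longrightarrow\; \|uQY\|^2 = uQYY^HQu^H \le uQu^H, \\
0 \le Q \le \tfrac{1}{\sigma^2} I_r,\quad \|u\| = 1
  &\;\Longrightarrow\; \|uQY\|^2 \le \tfrac{1}{\sigma^2}.
\end{align*}
```

The bound is deterministic: it holds for every realization of $Y$, which is why it can be used inside expectations later in the proof.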
In order to prove (71), it is thus sufficient to verify that $E(\|QYAY^Hv^H\|^2) \leq C$ where $C$ is a constant independent of $v$ and $t$. For this, we remark that

$$\|QYAY^Hv^H\|^2 \leq \frac{1}{\sigma^4}\,vYA^HY^HYAY^Hv^H$$

A straightforward but tedious calculation gives

$$E(vYA^HY^HYAY^Hv^H) = vD^2v^H\left|\frac{1}{t}\mathrm{Tr}(A^H\tilde{D})\right|^2 + vDv^H\,\frac{1}{t}\mathrm{Tr}(D)\,\frac{1}{t}\mathrm{Tr}(A^H\tilde{D}A\tilde{D})$$

As matrices $A$, $D$, $\tilde{D}$ and vector $v$ are uniformly bounded, we obtain that $E(\|QYAY^Hv^H\|^2) \leq C$. This, in turn, implies (71). One can show similarly that

$$E\sum_{i,j}\left|\frac{\partial\eta}{\partial Y_{i,j}}\right|^2 \leq C$$

As the $(d_i)$ and the $(\tilde{d}_j)$ are uniformly bounded (see (14)), (60) provides immediately (69). Lemma 2 is thus established.



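(67) and (68) imply that

$$\left|u(E(Q) - R)u^H\right| \leq \frac{C}{t^{3/2}} \tag{73}$$

In order to complete the proof of (18), we use Theorem 3 of [6], and obtain that

$$\frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{R}) = \frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{T}) + O\left(\frac{1}{t^2}\right)$$

or $\tilde{\alpha} = \tilde{\delta} + O(\frac{1}{t^2})$. It is easy to check that

$$|R_{i,i} - T_{i,i}| \leq C\,|\tilde{\alpha} - \tilde{\delta}|$$

Therefore, $\max_i|R_{i,i} - T_{i,i}| \leq \frac{C}{t^2}$, and $|u(R - T)u^H| \leq \frac{C}{t^2}$. Using (73), we eventually get (18).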



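We now establish (20). For this, we first prove the following lemma.

Lemma 3:

$$E\left(Q_{k,i}\overset{\circ}{Q}_{k',i'}\right) = \frac{1}{t}\frac{1}{1 + \tilde{\alpha}d_i}E\left[(QDQ)_{k,i'}(QY\tilde{D}\tilde{R}Y^H)_{k',i}\right] + E\left[\overset{\circ}{\beta}\,(QY\tilde{D}\tilde{R}Y^H)_{k,i}\frac{1}{1 + \tilde{\alpha}d_i}\overset{\circ}{Q}_{k',i'}\right] \tag{74}$$

Proof. We first note that (66) yields

$$E\left(Q_{k,i}\overset{\circ}{Q}_{k',i'}\right) = -\frac{1}{\sigma^2}E\left((QYY^H)_{k,i}\overset{\circ}{Q}_{k',i'}\right) \tag{75}$$

In order to be able to express $E\left((QYY^H)_{k,i}\overset{\circ}{Q}_{k',i'}\right)$, we evaluate

$$E\left((Qy_j)_k\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right) = \sum_{p=1}^{r}E\left(Q_{k,p}Y_{p,j}\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right)$$

For this, we use the integration by parts formula (61) in the case $\Phi(Y) = Q_{k,p}\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}$, and obtain

$$E\left(Q_{k,p}Y_{p,j}\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right) = \frac{d_p\tilde{d}_j}{t}\left[\delta(i - p)E\left(Q_{k,p}\overset{\circ}{Q}_{k',i'}\right) - E\left(Q_{p,p}(Qy_j)_k\overset{\circ}{Q}_{k',i'}\overline{Y}_{i,j}\right) - E\left(Q_{k,p}Q_{p,i'}(Qy_j)_{k'}\overline{Y}_{i,j}\right)\right] \tag{76}$$

Summing over $p$, and expressing $\beta = \frac{1}{t}\mathrm{Tr}(DQ)$ as $\beta = \alpha + \overset{\circ}{\beta}$ provides

$$E\left[(Qy_j)_k\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right] = \frac{d_i\tilde{d}_j}{t}E\left(Q_{k,i}\overset{\circ}{Q}_{k',i'}\right) - \frac{\tilde{d}_j}{t}E\left[(QDQ)_{k,i'}(Qy_j)_{k'}\overline{Y}_{i,j}\right] - \alpha\tilde{d}_jE\left[(Qy_j)_k\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right] - \tilde{d}_jE\left[\overset{\circ}{\beta}(Qy_j)_k\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right] \tag{77}$$

Solving w.r.t. $E\left[(Qy_j)_k\overline{Y}_{i,j}\overset{\circ}{Q}_{k',i'}\right]$ and summing over $j$ gives

$$E\left[(QYY^H)_{k,i}\overset{\circ}{Q}_{k',i'}\right] = \sigma^2d_i\tilde{\alpha}E\left(Q_{k,i}\overset{\circ}{Q}_{k',i'}\right) - \frac{\sigma^2}{t}E\left[(QDQ)_{k,i'}(QY\tilde{D}\tilde{R}Y^H)_{k',i}\right] - \sigma^2E\left[\overset{\circ}{\beta}\,\overset{\circ}{Q}_{k',i'}(QY\tilde{D}\tilde{R}Y^H)_{k,i}\right] \tag{78}$$

Plugging (75) into (78) and solving w.r.t. $E\left(Q_{k,i}\overset{\circ}{Q}_{k',i'}\right)$ gives (74).

We define $\eta$ by $\eta = uQu^H$. (74) yields immediately

$$E(\overset{\circ}{\eta}{}^2) = E(\eta\overset{\circ}{\eta}) = \frac{\sigma^2}{t}E\left(uQDQu^H\,uQY\tilde{D}\tilde{R}Y^HRu^H\right) + \sigma^2E\left[\overset{\circ}{\beta}\,\overset{\circ}{\eta}\,uQY\tilde{D}\tilde{R}Y^HRu^H\right] \tag{79}$$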



We define p x and p 2 by pi = uQDQu^ and p 2 = uQYDRY ff Ru H . The term E (uQDQu^ uQYDRY^Ru* 
is given by 

E fuQDQu H uQYDRY^Ru^) = E(pi)E(p 2 ) + ^Cp\P 2 ) 



In order to evaluate E(pj), % = 1, 2, we state the following Lemma 
Lemma 4: 



E(QDQ) 



diTi 



k,i 



c 

< — 
t 



1 — (7 4 77 

Let A be a uniformly bounded diagonal deterministic matrix. Then, 

E(QYAY ff ) M - a 2 d^Tr(ATD)T fc , 



C 
< — 
t 



(80) 



(81) 
The proof uses again the resolvent identity (66), the Nash-Poincaré inequality, the integration by parts formula, Theorem 3 in [6], and is omitted.

Using Lemma (0]), we get that 

C 



* 2 E( Pl )E(p 2 ) - (uT 2 Du«) 2 
1 — 



< 
~ t 



We verify that E(p 1 /) 2 ) is a 0(^72) term. We first remark that, as p 1 < then \°Pi\ < 2^^. The 

Schwartz inequality gives 

|e(pip 2 )I < 2 -^4- (% 2 rj < ^2 

by Lemma |2] Finally, we show that E(/377p 2 ) is a 0(7^72) term. We express this term as 

E(/^p 2 ) = E A)E(p 2 ) + E(te) 
Lemma |4] implies that E(p 2 ) is uniformly bounded, while the Schwartz inequality gives E(/3?7) = 0(7^72 )■ 

° o o 

In order to evaluate E(/37/y0 2 ), we write 

E(te) = E(/? w ° 2 ) - E(77)E(^p 2 ) 
As 77 < ^m., the Schwartz inequality gives immediately that 



Putting all the pieces together completes the proof of (|20 



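In order to establish (21), we first need to prove the following lemma. This lemma will also be useful to establish Lemma 1 below.

Lemma 5: Let $M$ be a uniformly bounded deterministic matrix. Then,

$$E\left|\frac{1}{t}\mathrm{Tr}\,M(Q - E(Q))\right|^4 \leq \frac{C}{t^4} \tag{82}$$

Moreover,

$$\sup_{u,\|u\|=1}E\left|u(Q - E(Q))u^H\right|^8 \leq \frac{C}{t^4} \tag{83}$$

We denote by $\rho$ the random variable $\rho = \frac{1}{t}\mathrm{Tr}\,M(Q - E(Q))$. $E|\rho|^4$ can be written as

$$E|\rho|^4 = (E|\rho|^2)^2 + \mathrm{Var}(\rho^2)$$

The first item of (15) implies that $(E|\rho|^2)^2 = O(\frac{1}{t^4})$. In order to evaluate $\mathrm{Var}(\rho^2)$, we use the Nash-Poincaré inequality in the case $\Phi(Y) = \rho^2$.

$$\frac{\partial\rho^2}{\partial\overline{Y}_{i,j}} = 2\rho\frac{1}{t}\sum_{p,q}M_{q,p}\frac{\partial Q_{p,q}}{\partial\overline{Y}_{i,j}} = -2\rho\frac{1}{t}\sum_{p,q}Q_{i,q}(Qy_j)_pM_{q,p} = -2\rho\frac{1}{t}\sum_{p}(QM)_{i,p}(Qy_j)_p = -2\rho\frac{1}{t}(QMQy_j)_i$$

Therefore,

$$E\sum_{i,j}\left|\frac{\partial\rho^2}{\partial\overline{Y}_{i,j}}\right|^2 = \frac{4}{t^2}E\left(|\rho|^2\,\mathrm{Tr}(Y^HQM^HQ^2MQY)\right)$$

Matrix $M^HQ^2M$ is uniformly bounded. Therefore,

$$\mathrm{Tr}(Y^HQM^HQ^2MQY) \leq C\,\mathrm{Tr}(Y^HQQY) = C\,\mathrm{Tr}(QQYY^H)$$

As $QYY^H = I - \sigma^2Q \leq I$, we obtain that

$$\mathrm{Tr}(Y^HQM^HQ^2MQY) \leq C\,\mathrm{Tr}(Q)$$

Hence,

$$E\sum_{i,j}\left|\frac{\partial\rho^2}{\partial\overline{Y}_{i,j}}\right|^2 \leq \frac{C}{t}E\left(|\rho|^2\frac{1}{t}\mathrm{Tr}(Q)\right) \leq \frac{C}{t}E(|\rho|^2)$$

As $E(|\rho|^2) = O(\frac{1}{t^2})$, this implies that

$$E\sum_{i,j}\left|\frac{\partial\rho^2}{\partial\overline{Y}_{i,j}}\right|^2 \leq \frac{C}{t^3}$$

We obtain similarly that

$$E\sum_{i,j}\left|\frac{\partial\rho^2}{\partial Y_{i,j}}\right|^2 \leq \frac{C}{t^3}$$

(82) follows immediately from the Nash-Poincaré inequality.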



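In order to prove (83), we first establish that

$$\sup_{u,\|u\|=1}E\left|u(Q - E(Q))u^H\right|^4 \leq \frac{C}{t^2} \tag{84}$$

and

$$\sup_{u,\|u\|=1}E\left|u(Q - E(Q))u^H\right|^6 \leq \frac{C}{t^3} \tag{85}$$

We consider a deterministic unit norm row vector $u$ and denote by $\eta$ the term $\eta = u(Q - E(Q))u^H$. $E|\eta|^4 = (E|\eta|^2)^2 + \mathrm{Var}(\eta^2)$. (20) implies that $(E|\eta|^2)^2 \leq \frac{C}{t^2}$ where $C$ is a constant which does not depend on $t$ and $u$. In order to evaluate the term $\mathrm{Var}(\eta^2)$, we use the Nash-Poincaré inequality in the case $\Phi(Y) = \eta^2$.

$$\frac{\partial\eta^2}{\partial\overline{Y}_{i,j}} = -2\eta\,uQy_j\,(Qu^H)_i$$

Therefore,

$$E\sum_{i,j}\left|\frac{\partial\eta^2}{\partial\overline{Y}_{i,j}}\right|^2 = 4E\left(|\eta|^2\,uQYY^HQu^H\,uQ^2u^H\right)$$

(72), $Q \leq \frac{I}{\sigma^2}$, and $E|\eta|^2 \leq \frac{C}{t}$ imply that

$$E\sum_{i,j}\left|\frac{\partial\eta^2}{\partial\overline{Y}_{i,j}}\right|^2 \leq \frac{C}{t}$$

We obtain similarly that

$$E\sum_{i,j}\left|\frac{\partial\eta^2}{\partial Y_{i,j}}\right|^2 \leq \frac{C}{t}$$

The Nash-Poincaré inequality eventually gives $\mathrm{Var}(\eta^2) \leq \frac{C}{t^2}$. We have therefore proved (84). In order to establish (85), we write $E|\eta|^6 = (E|\eta|^3)^2 + \mathrm{Var}(\eta^3)$. The Hölder inequality and (84) imply that $(E|\eta|^3)^2 \leq \frac{C}{t^3}$. The term $\mathrm{Var}(\eta^3)$ is also evaluated using the Nash-Poincaré inequality.

$$\frac{\partial\eta^3}{\partial\overline{Y}_{i,j}} = -3\eta^2\,uQy_j\,(Qu^H)_i$$

and

$$E\sum_{i,j}\left|\frac{\partial\eta^3}{\partial\overline{Y}_{i,j}}\right|^2 = 9E\left(|\eta|^4\,uQYY^HQu^H\,uQ^2u^H\right)$$

As $uQYY^HQu^H$ and $uQ^2u^H$ are uniformly bounded, (84) implies that

$$E\sum_{i,j}\left|\frac{\partial\eta^3}{\partial\overline{Y}_{i,j}}\right|^2 \leq \frac{C}{t^2}$$

Similarly,

$$E\sum_{i,j}\left|\frac{\partial\eta^3}{\partial Y_{i,j}}\right|^2 \leq \frac{C}{t^2}$$

(85) follows immediately from the Nash-Poincaré inequality.
Starting from $E|\eta|^8 = (E|\eta|^4)^2 + \mathrm{Var}(\eta^4)$, (83) is proved similarly.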



In order to establish (|2TT ). we introduce the following notations: 

p 1>k = v fc QDQvf ,p2 ik = v fc QYDRY^Rvf , % = v fc Qvf 
Using (79) and Lemma 4, it is easy to check that



^TVar^ife) - - f\ - ^( Kfc v fc T 2 Dvf f = E ( few^l J + 0( : 

7.. ^ 7. \ V 7- / / ' 



It therefore remains to show that 



For this, we write /j 2 ,fc = ^2 fe + E(p2,fc)- Therefore, 



0(- 



(86) 



E 



/? I ^ K fc E(p 2 ,fc)^A 



+ E 



V fc y 



(87) 



The term E 



P Efc K fc E(p 2 ,fc)?7, 



matrix defined by 



can also be written as E I (3 Tr(MQ) where M is the deterministic 



M = ^K fe E( / 02,fc)vfv fe 



151) thus implies that E 



Lemma @] implies that sup fc |E(p2fc| < C. Therefore, matrix M is uniformly bounded. The first item of 

o' 2 

Tr(MQ) = 0(1). Similarly, E|/?| 2 = O(^) holds. The Schwartz inequality 

shows that E ^/3Tr(MQ)^ = 0{\). 

In order to evaluate the second term of the righthandside of (l87l ). we remark that 

Wkh,k)\ < (E|P 2 ,,| 2 ) 1/2 (1E|^| 4 ) 1/4 (E|%| 4 ) 1/4 

Lemma E] implies that (El/^ 2 ) 1 / 2 = O(^), and dH gives (E\?) k \ A ) l/4 = O(^). As (E|/? fc | 4 ) 1/4 = 
0(j) by ([82]>, we get that 

sup|E(/%p 2fc )| = 0(72) 

This, in turn, implies that the second term of the righthandside of (|8"7T ) is a 0(j) term. This completes 
the proof of (ITU 



We finally prove ( fT9l . We just sketch the proof because similar arguments have been used in order to 

000 

establish Lemma [3] We evaluate E(Q jfci j 2 Qfc 3 j 3 ) for each integers k%, &2, 13, Aft). We first 

o o 

calculate E(Qfc t j 2 Qfc 3 i3 ). For this, we use the resolvent identity (l66l) and write 



(QYY H ) fcljil Q fc2ii2 Q fc3ii 



o o 
Using the integration by parts formula as in the proof of Lemma 3, we obtain that



4e 



o o 



(QYY^) feliil Q fe2) , 2 Q fe3) , 



o o 



IE 



-E 



(QDQ) fcl>i2 (QYDRY^) fe2)il Q fc3)i3 
(QDQ) fclii3 (QYDRY^) fc3iil Q fc3ii2 
^(QYDRY^) fciil Q fc2i , 2 Q fc3i , 3 



O O 



Plugging (l66t into the above equation and solving w.r.t. E(Q/ Cl; j 1 Q fc2 i 2 Qk 3 ,i 3 )> we obtain that 



E(Qfc 1 ,i 1 Q fcai i a Q fcS) 



+^E 



+^E 



R fel)fel 5(fei-H)E(Q feii2 Q fc3 , i3 ) 



(QDQ) feli42 (QYDRY^R) fc2jJl Q fc3jl3 
(Q D Q)fe,i 3 (QYDRY^R) feil Qk 2 ,i 2 
^(QYDRY^R)^^ Qfe 3 ,i 3 Qfc 3 ,i 3 



+<r 2 E 

o o 

Writing E(Q fcl)il Q fe2ji2 Q fe3)i3 ) as 

E(Q fcl , il 4 2 , i 2 4 3 ,i3)= IE (QA ;i , il ) E (Qfc 2 ,i 2 Q fc 3,i3)+ E (Qfc 1 , ll Qfc 2 ^Qfc3, J 



o o 



o o 



and using d73K we obtain that 

o o o 



t e 



+ ^E 



+<7 2 E 



(QDQ) fcl i2 (QYDRY^R) fc2 il Q fc3) 
(QDQ) felji3 (QYDRY^R) &3iil Q fe2)i2 

o o 

r Hj>\, . n n _|_ ( 



(88) 



/3(QYDRY^R) fciiil Q fc2)i2 Q fc3j43 

We consider a unit norm deterministic row vector u and define 77 = uQu, = uQDQir^ and p 2 
uQYDRY^Ru^. Using (EU), we get that 



,o3 x 2<T 2 



O . 

We write E(pip 2 77) as 



E(t7 ) = — E(p lP2V ) + o*nPm 



E(pip2»7) = E(p 1 )E(p 2 r ? ) +E(p 2 )E(p 1 r ? ) +E(p 1 p 2 r 7 ) 



E(pi) is uniformly bounded while E(p 2 ?7) is a O(j) term. E(p 2 )E(p 1 r7) is a O(j) term for the same 
reasons. Finally, we remark that |p x | < 2^2. . Therefore, E(p 1 p 2 rj) is a O(j) term, and 2 ^-E(pip 2 r^) is 
a O(p-) term. 
In order to evaluate $E(\overset{\circ}{\beta}\rho_2\overset{\circ}{\eta}{}^2)$, we write

E(/W) = E( P2 )E(^ 2 ) 
E(p2) is uniformly bounded. E(/3r; ) = O(p-) holds by the Schwartz inequality. We finally write that 

|E(^ 2 )|<( E |p 2 | 2 )) 1/2 (e|^| 4 )) 1/4 (e|^|«)) 1/4 

° o o 2 

and use Lemma [5] to justify that E( ( 9p 2 J 7 ) = 0{h)- This completes the proof of ([191 . 

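Appendix B
Proof of Lemma 1

We first establish that

$$E\left(\log(1 + \epsilon_j)\right)^2 \leq C \tag{89}$$

for some constant $C$ independent of $j$ and $t$. For this, we remark that

$$\sigma^2\tilde{Q}_{j,j} = \frac{1}{1 + \beta_j} \tag{90}$$

is less than 1. Therefore, $-E(\log(\sigma^2\tilde{Q}_{j,j})) \geq 0$. As $\log(1 + \epsilon_j)$ is equal to $\log(\sigma^2\tilde{Q}_{j,j}) - E(\log(\sigma^2\tilde{Q}_{j,j}))$, we get that

$$\log(1 + \epsilon_j) \geq \log(\sigma^2\tilde{Q}_{j,j}) = -\log(1 + \beta_j)$$

$\beta_j \geq 0$ implies that $\log(1 + \beta_j) \leq \beta_j$. Therefore, $\log(1 + \epsilon_j) \geq -\beta_j$ and $(\log(1 + \epsilon_j))^2 \leq (\beta_j)^2$. In order to prove (89), it is thus sufficient to establish that $E(\beta_j^2) \leq C$. We denote by $h_j$ the column $j$ of matrix $H$. $\beta_j$ is upper-bounded by the matched filter bound $\frac{\|h_j\|^2}{\sigma^2}$. Using the expression of vector $h_j$ in terms of matrices $C_r$, $C_t$ and $H_{iid}$, it is easy to check that

$$E(\|h_j\|^4) \leq C$$

for some constant $C$ independent of $j$ and $t$. This completes the proof of (89). Note that (89) implies that for each $j$, $E|\log(1 + \epsilon_j)| < \infty$, a property which was implicitly assumed in the proof of Theorem 1.

We now complete the proof of Lemma 1. We consider a constant $A \in (0, 1)$, and express $\log(1 + \epsilon_j)$ as

$$\log(1 + \epsilon_j) = \sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k}\epsilon_j^k\,\mathbb{1}_{|\epsilon_j| \leq A} + \log(1 + \epsilon_j)\,\mathbb{1}_{|\epsilon_j| > A}$$

where for any set $B$, $\mathbb{1}_B$ is equal to 1 on $B$ and 0 outside $B$. This leads to the following expression of $E(r_j)$:

$$E(r_j) = \sum_{k=4}^{\infty}\frac{(-1)^{k-1}}{k}E\left(\epsilon_j^k\,\mathbb{1}_{|\epsilon_j| \leq A}\right) - E\left(\epsilon_j\mathbb{1}_{|\epsilon_j| > A}\right) + \frac{1}{2}E\left(\epsilon_j^2\mathbb{1}_{|\epsilon_j| > A}\right) - \frac{1}{3}E\left(\epsilon_j^3\mathbb{1}_{|\epsilon_j| > A}\right) + E\left(\log(1 + \epsilon_j)\mathbb{1}_{|\epsilon_j| > A}\right) \tag{91}$$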

Using (EHJ) and ([83]>, we remark that 

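$$E|\epsilon_j|^8 \leq \frac{C}{t^4} \tag{92}$$

From the Markov inequality and the Hölder inequality, we obtain that

$$P(|\epsilon_j| > A) \leq \frac{C}{A^8t^4} \tag{93}$$

and

$$E(|\epsilon_j|^6) \leq \frac{C}{t^3}, \quad E(|\epsilon_j|^4) \leq \frac{C}{t^2}, \quad E(|\epsilon_j|^3) \leq \frac{C}{t^{3/2}}, \quad E(|\epsilon_j|^2) \leq \frac{C}{t} \tag{94}$$

By the Schwarz inequality,

$$\left|E\left(\epsilon_j\mathbb{1}_{|\epsilon_j| > A}\right)\right| \leq \left(P(|\epsilon_j| > A)\right)^{1/2}\left(E|\epsilon_j|^2\right)^{1/2}$$

(93) and (94) thus imply that $\left|E\left(\epsilon_j\mathbb{1}_{|\epsilon_j| > A}\right)\right|$ is upper-bounded by $\frac{C}{t^{5/2}}$. We obtain similarly that

$$E\left(\epsilon_j^2\mathbb{1}_{|\epsilon_j| > A}\right) \leq \frac{C}{t^3}$$

and

$$\left|E\left(\epsilon_j^3\mathbb{1}_{|\epsilon_j| > A}\right)\right| \leq \frac{C}{t^{7/2}}$$

Using (89), (93) and the Schwarz inequality yields

$$\left|E\left(\log(1 + \epsilon_j)\mathbb{1}_{|\epsilon_j| > A}\right)\right| \leq \frac{C}{t^2}$$

We now establish that

$$\sum_{k=4}^{\infty}E\left(|\epsilon_j|^k\,\mathbb{1}_{|\epsilon_j| \leq A}\right) \leq \frac{C}{t^2} \tag{95}$$

For $k \geq 4$, $E\left(|\epsilon_j|^k\mathbb{1}_{|\epsilon_j| \leq A}\right) \leq A^{k-4}E(|\epsilon_j|^4)$. Therefore,

$$\sum_{k=4}^{\infty}E\left(|\epsilon_j|^k\,\mathbb{1}_{|\epsilon_j| \leq A}\right) \leq \left(\sum_{k=0}^{\infty}A^k\right)E\left(|\epsilon_j|^4\right)$$

As $A < 1$, $\sum_{k=0}^{\infty}A^k < \infty$ so that (95) follows from (94). Putting all the pieces together gives

$$|E(r_j)| \leq \frac{C}{t^2}$$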
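The geometric-series argument used for (95) can be illustrated on the deterministic remainder of the logarithm. The sketch below assumes, as expression (91) suggests, that $r_j$ is the remainder of $\log(1 + \epsilon_j)$ after its third-order expansion, and checks on a grid that $|\log(1+\epsilon) - (\epsilon - \epsilon^2/2 + \epsilon^3/3)| \leq \frac{\epsilon^4}{1 - A}$ whenever $|\epsilon| \leq A < 1$. The function names and the value $A = 0.9$ are hypothetical choices.

```python
import math


def remainder(eps):
    """log(1 + eps) minus its third-order Taylor expansion around 0."""
    return math.log1p(eps) - (eps - eps ** 2 / 2.0 + eps ** 3 / 3.0)


def tail_bound(eps, A):
    """Bound |eps|^4 * sum_{k>=0} A^k = |eps|^4 / (1 - A), valid for |eps| <= A < 1."""
    return abs(eps) ** 4 / (1.0 - A)


if __name__ == "__main__":
    A = 0.9
    worst = 0.0
    for k in range(-90, 91):
        eps = k / 100.0
        if eps != 0.0:
            worst = max(worst, abs(remainder(eps)) / tail_bound(eps, A))
    print("largest ratio remainder / bound on the grid:", worst)
```

Since the remainder is of order $\epsilon^4$ on the event $|\epsilon_j| \leq A$, its expectation is controlled by $E|\epsilon_j|^4$, which is where the $\frac{C}{t^2}$ rate comes from.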